Brief Reports 


The Journal of Consulting Psychology will 
accept Brief Reports of research studies in 
clinical psychology for early publication with- 
out expense to the author. The procedure is 
intended to permit the publication of soundly 
designed studies of specialized interest or lim- 
ited importance which cannot now be ac- 
cepted because of lack of space. Several pages 
in each issue will be devoted to Brief Reports, 
published in the order of their receipt with- 
out respect to the dates of receipt of the regu- 
lar articles. Most Brief Reports appear in the 
first or second issue to go to press following 
their final acceptance. 


An author who wishes to submit a Brief 
Report: 
1. Sends the Brief Report, limited to one printed
page and prepared according to the specifications
given below. 


2. Also sends to the Editor a full report of the re- 
search study, in sufficient detail to give a clear ac- 
count of its background, procedure, results, and con- 
clusions, which will be filed with the American 
Documentation Institute to insure indefinite avail- 
ability. 

3. Prepares at least 100 mimeographed copies of 
the full report, which the author will send without
charge to all who request it as long as the supply
lasts. 


4. Agrees not to submit the full report to another 
journal of general circulation. 


Specifications 

Brief Report. The Brief Report should give 
a clear, condensed summary of the procedure 
of the study and as full an account of the re- 
sults as space permits. 

To insure that the Brief Report will be no 
longer than one printed page, its typescript, 
including all matter except the title and the 
author’s lines, must not exceed 75 lines av-
eraging 42 characters and spaces in length.
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 
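
The length rule lends itself to a mechanical check. The following is a minimal sketch in Python, assuming the body of the typescript (title and author's lines excluded) is available as plain text. The function names are hypothetical, and the decision to ignore the blank lines produced by double spacing is an assumption of this illustration, not part of the specifications. The pitches used are the conventional ones: 12 characters per inch for elite and 10 for pica.

```python
MAX_LINES = 75        # line quota stated in the specifications
CHARS_PER_LINE = 42   # average line length, spaces included

def fits_brief_report(typescript: str) -> bool:
    """Return True if the text stays within 75 lines averaging 42 characters."""
    # Blank lines come from double spacing and are not counted (assumption).
    lines = [ln for ln in typescript.splitlines() if ln.strip()]
    total_chars = sum(len(ln) for ln in lines)
    return len(lines) <= MAX_LINES and total_chars <= MAX_LINES * CHARS_PER_LINE

def line_width_inches(chars: int, pitch: int) -> float:
    """Width of a typed line at a given pitch (characters per inch)."""
    return chars / pitch

print(line_width_inches(42, 12))  # elite typing
print(line_width_inches(42, 10))  # pica typing
```

The two printed widths correspond to the 3.5-inch and 4.2-inch margins given above.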

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 75 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 75-line quota: 1

1 An extended report of this study may be ob-
tained without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name 
and address), or for a fee from the American Docu- 
mentation Institute. Order Document No. ——, re- 
mitting $—— for microfilm or $—— for photo- 
copies. 

Extended report. Because the extended re- 
port is intended for photoduplication, and is 
not copy to be sent to a printer, its style 
should differ in several ways from that of 
other manuscripts: (a) The extended report 
should be typed with single spacing for 
economy in duplication. (b) Tables and fig-
ures should be placed adjacent to the text
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style speci- 
fied by the Publication Manual (1). 
Editors. Publication manual of the American


The Wittenborn Psychiatric Syndromes: 
An Oblique Rotation 


Maurice Lorr 


Veterans Administration 1 


There is now substantial evidence that 
symptom syndromes among mental hospital 
patients are relatively stable. The same fac- 
tors appear despite differences in hospitals 
and patients sampled (1, 3, 7), differences in 
raters (8), and differences in rating schedules 
(4). However, there is by no means complete 
agreement in this domain. To some extent 
disagreements are due to the failure to include 
essential references or marker variables. For 
example, a compulsive-phobic syndrome can- 
not appear in a study in the absence of a 
suitable number of defining variables in the 
correlation matrix. Other differences, more 
easily reconciled, have their origin in theo- 
retical biases. Some workers systematically 
utilize an orthogonal framework for their ro- 
tations; most others use oblique or correlated 
frames of reference. 

It has not been uncommon in studies of 
cognitive and perceptual abilities to offer both 
oblique and orthogonal rotational solutions 
for a given body of data. The degree of fit to 
the data, the plausibility of inferred con- 
structs, and the extent of consistency with the 
results of other investigators may then be 
compared. It would seem worthwhile to ob- 
tain alternative solutions for a few of the 
more important psychiatric symptom studies 
available for the purpose of securing greater 
agreement and consistency from study to 
study. 

The purpose of the present investigation is 
to clarify certain of the factors identified by 
Wittenborn and Holzberg (9), and to deter- 
mine what second-order factors exist in this 
body of data. These authors, using an ortho-
gonal frame of reference, identified a schizo-
phrenic excitement factor which has not been
found in other similar investigations. On the
other hand, they failed to isolate a parameter
of thinking disorganization or dissociation
which has appeared in several analyses re-
ported by Degan (1) and Lorr, Jenkins, and
O’Connor (3). It is hypothesized that rotation
of the Wittenborn-Holzberg data to an
oblique frame of reference will result in the
disappearance of the Schizophrenic Excite-
ment factor, reveal a factor of thinking dis-
organization, and improve the factor structure
generally.

1 From the Veterans Benefits Office, Washington,
D. C.

The Wittenborn-Holzberg data are chosen 
for reanalysis because they were collected 
under controlled conditions, are comparable 
to other large investigations, and have been 
properly rotated. Psychiatrists rated 250 
newly admitted patients to a Connecticut 
state hospital over a six-month period on the 
basis of observations made during a standard 
period when no patient was under treatment. 
Alcoholic, senile, sclerotic, paretic, and psy- 
chopathic patients were excluded. 


Procedure 


The orthogonally rotated factor matrix re- 
ported (9) was rerotated blind by means of 
the single plane method with the object of 
attaining oblique simple structure. The cor- 
relations between the primary vectors result- 
ing from this analysis were then in turn fac- 
tored by the multiple group method, and ro- 
tated in accordance with the usual criteria. 
Finally, by means of a suitable transforma- 
tion, the correlations of the scales with each 
of the second-order factors were obtained. 
These correlations provide a much superior
base for making inferences concerning the
second-order factors than the correlations of
the primary vectors with the second-order
reference vectors.


Table 1
Rotated Oblique Factor Matrix V
(Decimal points are omitted; ? marks a digit illegible in the source)

                                                          Factors
Scale Description                         No.    A    B    C    D    E    F    G
Difficulty in sleeping                      1  -21   05   56   36   19   00  -05
Rapid change of ideas                       2  -13   26   40  -06  -05   53   09
Unjustified sexual beliefs                  3   31   03  -06   01   03   21   15
Obsessional thinking                        4   15  -02  -09   10  -07   03   50
Unrealistic self blame                      5  -04   03  -03   67   ??  -01   01
Gives in easily to others                   6  -05   22  -41   26   24  -05   07
Restless                                    7  -27   00   57  -13  -02   37   25
Little concern for others                   8   04   53   41   00  -06   01   02
Use made of physical disease symptoms       9   03  -11  -08   14   57   18  -04
Eats very little                           10  -12   22   40   27   09  -21  -17
Impudent or impolite                       11   04  -01   56  -03   00   18   03
Expresses irritation                       12   13  -20   53  -05   04   21   11
Avoids people                              13   27   45  -04   45   02  -23  -07
Talks loudly                               14  -16  -08   52  -20   03   39   28
Behavior affected by phobias               15   00   00   ?4  -01   03  -02   59
Slovenly and unkempt                       16  -21   40   53   20   09  -07  -08
Engrossed in plans                         17  -01  -01   00   0?   12   60   02
Feelings of impending misfortune           18   17  -05   09   55   03  -03   07
Conspicuously optimistic                   20   06   00   12  -08  -10   63  -04
Difficulty carrying out plans              21  -06   60   14   17   04   22   08
Doubts he can be helped                    22  -02  -01   03   56   04  -10   09
Concerned with impression made on others   23  -01  -10  -23   17   18   22   25
Thinking bizarre or obscure                24   49   34   12   15  -06   15  -09
Little organic basis for complaints        25  -03  -01  -05  -04   66   10   20
Feels conspired against                    26   61  -11   05   01  -01   02   00
Feels others control his behavior          27   63   06  -04   05   03   03   18
Acutely distressed by anxiety              28  -01  -20  -03   20   21  -07   42
Organic pathology with emotional basis     29  -07  -10   00   04   68  -02   02
Concerned with orderliness                 30  -02  -16  -31   02   27   18   15
Attention-demanding                        31   12  -03   19  -01   15   42   06
Overt activity slowed or delayed           32   00   54  -08   34   03  -13  -01
Grandiose notions                          33   35   18   00   09  -03   34  -21
Believes he has no psychological problem   34   12   27   31  -23  -10   05  -07
Exhibits compulsive acts                   35  -04   02   07  -04  -10   09   51
Rate of speech variable                    36  -17   24   44  -03  -05   25   19
Belligerent or combative                   37   04   04   71   08   01  -05  -03
Mood changes frequent                      39  -15  -06   4?   ??   0?   27   28
Suicidal thoughts or impulses              40  -03  -11   06   46  -13  -14  -01
Failures of affective response             41   26   64  -13   06  -03  -01  -04
Little concern over physical handicaps     42  -02  -01   07   04   47   02  -16
Difficulty in making decisions             43  -09   65   07   28   11  -04   02
Distorts facts to defend opinions          44   33   03   31   12  -10   26  -17
Has hallucinatory experiences              45   47   29   07   15   01  -06   02
Life history memory poor                   46  -07   48   19   02   13  -10  -10
Fears committing abhorred act              47   11   00  -12   29   13   0?   13
Words irrelevant to recognizable ideas     48   10   66   10  -04  -02   07  -01
Anxiety affects task performance           49   04   11  -06   33   18   16   14
Forgets earlier insights                   50   26   19   09   04   09   25  -02
Speech is stilted                          51   06   15   0?  -03  -04   ??   12
Exaggerated affective responses            54  -16  -12   4?   10   07   37   06
Reluctant to conform or cooperate          55  -01   30   68   13   11  -03   00
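
The algebra behind the rerotation step of the Procedure can be sketched in a few lines of Python. This is an illustration only, not the single plane method or the multiple group method themselves: the orthogonal loadings `F` and the transformation matrix `LAMBDA` are hypothetical values chosen for two factors, and the helper names are assumptions. The sketch shows how an oblique reference-structure matrix, the cosines between reference vectors, and the correlations between primary vectors follow from a given transformation.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def unit_columns(m):
    """Scale every column of m to unit length."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*m)]
    return [[v / n for v, n in zip(row, norms)] for row in m]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical orthogonal loadings: four scales on two factors.
F = [[0.60, 0.30], [0.55, 0.35], [0.20, 0.65], [0.25, 0.60]]

# Hypothetical transformation matrix; columns are unit-length reference vectors.
LAMBDA = unit_columns([[1.00, -0.45], [-0.45, 1.00]])

V = matmul(F, LAMBDA)                   # oblique reference-structure matrix
C = matmul(transpose(LAMBDA), LAMBDA)   # cosines between reference vectors
C_inv = inverse_2x2(C)
d = [1 / math.sqrt(C_inv[i][i]) for i in range(2)]
# Correlations between primary vectors: the matrix a second-order analysis factors.
R_primary = [[d[i] * C_inv[i][j] * d[j] for j in range(2)] for i in range(2)]

for row in V:
    print(" ".join(f"{v:6.2f}" for v in row))
print(f"correlation between primaries: {R_primary[0][1]:.2f}")
```

With only two factors the correlation between the primaries is simply the negative of the cosine between the reference vectors; with more factors the full inverse of `C` is required.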


The First-Order Factors 


Interpretation of each factor is made in 
terms of variables correlated + .30 or higher 
with the reference axis except where stated 
otherwise. 

The first factor, A, as shown in Table 1, 
is clearly identical with the Paranoid Schizo- 
phrenic factor originally identified (9). Only 
the tendency to repudiate earlier insights or 
admissions (No. 50) is missing. Factor A is 
essentially the same as the Paranoid Projec- 
tion factor isolated by Lorr, Jenkins, and 
O’Connor (3). 
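
The selection rule stated above (+.30 or higher) can be illustrated with a few Factor A entries copied from Table 1. This is a small Python sketch; the function name `salient_scales` is hypothetical, and only a subset of the scales is included.

```python
SALIENT = 0.30  # cutoff stated in the text

# Factor A entries for selected scales, from Table 1 (decimal points restored).
factor_a = {
    "Feels others control his behavior": 0.63,
    "Feels conspired against": 0.61,
    "Thinking bizarre or obscure": 0.49,
    "Has hallucinatory experiences": 0.47,
    "Grandiose notions": 0.35,
    "Distorts facts to defend opinions": 0.33,
    "Unjustified sexual beliefs": 0.31,
    "Avoids people": 0.27,
    "Forgets earlier insights": 0.26,
    "Restless": -0.27,
}

def salient_scales(loadings, cutoff=SALIENT):
    """Scales correlated +cutoff or higher with the reference axis, largest first."""
    kept = [(name, value) for name, value in loadings.items() if value >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)

for name, value in salient_scales(factor_a):
    print(f"{value:+.2f}  {name}")
```

Scale No. 50 (forgets earlier insights), at .26, falls below the cutoff, which agrees with the remark that it is missing from Factor A.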

The original Wittenborn-Holzberg Schizo- 
phrenic Excitement factor is defined by 16 
scale variables with factor loadings of .35 or 
higher. These variables, in order of size of 
factor loading, are as follows: reluctance to 
cooperate, slovenly and unkempt, little con- 
cern for others, combative, difficulty in sleep- 
ing, eating little, difficulty in carrying out 
plans and making decisions, variable rate of 
speech, irrelevant speech, poor memory, rapid 
change of ideas, restless, impudent, denies 
having problem, and is not orderly. As hy- 
pothesized, the oblique Factor B isolated here 
eliminates all of the elements of excitement. 
Factor B is defined by the following in order 
of size of factor loading: words irrelevant to 
recognizable ideas, difficulty in making deci- 
sions, failures of affective response, difficulty 
in carrying out plans, little concern for others, 
overt activity slowed or delayed, life history 
memory poor, avoids people, and slovenly and 
unkempt. Bizarre or obscure thinking and a 
reluctance to conform or cooperate also show 
minor correlations with B. 

As hypothesized, B represents a disorgani- 
zation of thinking. However, it also includes 
an element of social withdrawal. B resembles 
most the factor of Conceptual Disorganization 
(3) which is marked by irrelevant and inco- 
herent speech, thought-feeling disharmony, 
stereotyped language, speech blocking, slowed 
ideation, self preoccupation, and disorienta- 
tion for time. However, cross identification is 
not entirely certain because overlap in the
variables included in the two studies com-
pared is only partial. Guertin’s (2) Confused
Withdrawal also resembles Factor B. 

Wittenborn and Holzberg label their third 
factor Manic-Depressed and their sixth factor 
Paranoid Condition. As a result of the oblique 
rotations, these two factors undergo consider- 
able change, and two excitement parameters 
emerge. A comparison of columns C and F 
reveals that the two factors have in common
the following variables: rapid change of ideas,
restless, talks loudly, rate of speech variable,
mood changes frequent, and exaggerated af-
fective response.2

Factor C is distinguished from F by the 
following: belligerent or combative, reluctant 
to conform or cooperate, difficulty in sleeping, 
impudent or impolite, expresses irritation, 
slovenly and unkempt, does not give in easily 
to others, little concern for others, and eats 
very little. Implied here is an Excitement with 
a Hostile Belligerence. Degan’s Hyper-irrita- 
bility and Lorr et al.’s Belligerence factors 
closely resemble Factor C but do not include 
the aspects of excitement to be found here. 

Factor F is distinguished from C by ex- 
pansive features such as the following: con- 
spicuously optimistic, engrossed in plans, at- 
tention-demanding, speech stilted, and gran- 
diose notions. The tendency to distort facts to 
defend opinions has a doubtful correlation of 
.26 with Factor F. Otherwise, there is no evi- 
dence of any paranoid element in this factor. 
The parameter underlying F appears to be an 
Excitement with Expansiveness. F is most 
similar to and perhaps identical with Degan’s 
Manic Hyper-excitability which is defined by 
excitement, destructiveness, and euphoria. 

An examination of the variables showing 
small negative correlations with F reveals a 
possible depressive pole. The pole is charac- 
terized as follows: avoids people, eats very 
little, overt activity slowed or delayed, and 
suicidal thoughts or impulses. This suggests 
that Factor F represents Wittenborn’s Manic- 
Depressed factor with the hostile belligerent 
elements removed and depressed pole reduced 
in importance. 

The fourth factor, D, is essentially the
same as Wittenborn’s Anxiety. It is charac-
terized by the following: unrealistic self
blame, feeling of impending doom, doubts he
can be helped, suicidal thoughts or impulses,
avoids people, difficulty in sleeping, overt ac-
tivity slowed or delayed, and anxiety affects
task performance. Factor D seems identical
with Degan’s Depression syndrome and Lorr,
Jenkins, and O’Connor’s Melancholy Agita-
tion.

2 The assistance of Richard L. Jenkins, M.D., in
the interpretation of the factors is gratefully ac-
knowledged.


Table 2
Correlation of Scales with Second Order Factors
(Decimal points are omitted; ? marks a digit illegible in the source)

                                                  Factors
Scale Description                         No.    X    Y    Z
Difficulty in sleeping                      1   18   21  -02
Rapid change of ideas                       2   01   33   01
Unjustified sexual beliefs                  3   09   21   10
Obsessional thinking                        4   42   05   07
Unrealistic self-blame                      5   44  -08   17
Gives in easily to others                   6   18  -21  -07
Restless                                    7   09   33   02
Little concern for others                   8   03   21  -32
Use made of physical disease symptoms       9  -04   18   12
Eats very little                           10   07   06  -19
Impudent or impolite                       11   00   38   02
Expresses irritation                       12   03   ?4   14
Avoids people                              13   25  -03  -18
Talks loudly                               14   05   39   06
Behavior affected by phobias               15   40   07  -05
Slovenly and unkempt                       16   08   15  -28
Engrossed in plans                         17  -01   24   25
Feelings of impending misfortune           18   40   09   19
Conspicuously optimistic                   20  -10   30   25
Difficulty carrying out plans              21   15   12  -22
Doubts he can be helped                    22   43  -06   12
Concerned with impression made on others   23   24   01   18
Thinking bizarre or obscure                24   03   30  -02
Little organic basis for complaints        25   01   19  -06
Feels conspired against                    26  -01   31   14
Feels others control his behavior          27   14   30   06
Acutely distressed by anxiety              28   40   03   10
Organic pathology with emotional basis     29  -06   14  -04
Concerned with orderliness                 30   06  -02   16
Attention-demanding                        31  -01   35   16
Overt activity slowed or delayed           32   22  -12  -25
Grandiose notions                          33  -11   25   12
Believes he has no psychological problem   34  -18   21  -20
Exhibits compulsive acts                   35   34   07  -01
Rate of speech variable                    36   12   24  -10
Belligerent or combative                   37   05   36  -08
Mood changes frequent                      39   18   30   06
Suicidal thoughts or impulses              40   32  -10   15
Failures of affective response             41   01   02  -29
Little concern over physical handicaps     42  -16   14  -04
Difficulty in making decisions             43   18  -02  -33
Distorts facts to defend opinions          44  -04   36   16
Has hallucinatory experiences              45   11   22  -10
Life history memory poor                   46  -07   05  -34
Fears committing abhorred act              47   25   02   11
Words irrelevant to recognizable ideas     48  -03   11  -35
Anxiety affects task performance           49   27   08   09
Forgets earlier insights                   50  -02   27   02
Speech is stilted                          51   04   21   09
Exaggerated affective responses            54   09   31   18
Reluctant to conform or cooperate          55   08   35  -22

The fifth and seventh factors isolated here 
are identical with the Conversion Hysteria 
and Phobic-Compulsive reaction factors of 
the original study. Not only are the defining 
variables identical but the factor loadings are 
virtually the same in the two analyses. 


The Second-Order Factors 


The rotated second-order factors are nearly 
orthogonal. The cosines of the angles between 
the three reference vectors are .04, .00, and 
.00.3
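
To give a sense of scale, the reported cosines can be converted to angles. A short Python sketch follows; the helper name `angle_degrees` is hypothetical, and the three cosines are those given in the text.

```python
import math

cosines = [0.04, 0.00, 0.00]  # values reported in the text

def angle_degrees(cosine: float) -> float:
    """Angle between two reference vectors, given their cosine."""
    return math.degrees(math.acos(cosine))

for c in cosines:
    print(f"cosine {c:.2f} -> {angle_degrees(c):.1f} degrees")
```

A cosine of .04 corresponds to an angle within about two and a half degrees of a right angle, which is the sense in which the factors are nearly orthogonal.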

The first of the second-order factors, X, as 
shown in Table 2 is characterized by: unreal- 
istic self blame, a feeling of hopelessness, ob- 
sessive thoughts, disrupting phobias, feelings 
of impending misfortune, acute anxiety, com- 
pulsive acts, and suicidal thoughts or im- 
pulses. Latent here is a pathological intro- 
punitiveness or self-directed hostility common 
to Melancholy Agitation and the Phobic- 
Compulsive reaction. 

Factor Y is best defined by irritability, loud 
speech, impudence, belligerence or combative- 
ness, distortion of facts to defend opinions, 
attention-demanding behavior, and reluctance 
to cooperate. Other lesser elements are para- 
noid tendencies and an expansive optimism. 
Factor Y is most similar to Degan’s second 
order factor Mania, and represents a condi- 
tion of excitement sometimes expressed in a 
hostile fashion and sometimes in a grandiose 
expansive fashion. 

The scale variable correlations with Factor 
Z are all .35 or less which implies that Z may 
represent a residual plane. The irrelevant
speech, poor memory for life history, difficulty
in making decisions, little concern for others,
failure of affective response, slovenly and un-
kempt appearance, and slowed overt activity
all suggest a personality disorganization.
However, the significance of this factor seems
doubtful.

3 To save printing costs, the transformation matrix,
the matrix of correlations between the primary fac-
tors, and the rotated second-order factors matrix
have been deposited with the American Documenta-
tion Institute. Order Document No. 5363, remitting
$1.25 for microfilm or $1.25 for photocopies.

Discussion 


The factor structure that emerges from the 
oblique rotations is rather well defined. Why,
then, is there not greater agreement with the 
findings of other investigators? First, it may 
be that additional meaningful factors could 
have been extracted. Others have isolated as 
many as 9 and 11 factors (3, 4). Incomplete 
factor extraction would tend to obscure both 
the first- and second-order factors identified. 
Another restriction is the extent of overlap in 
the variables represented in the various stud- 
ies reported. A third limitation lies in the ab- 
sence of essential reference variables to define 
factors likely to be present (an inference 
based on studies published subsequent to 
Wittenborn’s). The Wittenborn Psychiatric
Rating Scales do not explicitly include man- 
neristic movements or postures, stereotyped 
words or phrases, or separate scales for visual 
and auditory hallucinations. In the absence of 
these distinctive reference variables to define 
a factor of Perceptual Distortion or of Motor 
Disturbance, no common patterns can ap- 
pear; they are lost in the unique variance. 
These seem to be some of the conditions re- 
stricting the findings from this study. 


Summary 


The orthogonally rotated Wittenborn-Holz-
berg data descriptive of 250 psychotic pa-
tients were re-rotated obliquely for the purpose
of (a) clarifying the factor structure and (b)
identifying any second-order factors that 
might be present. It was hypothesized that a 
factor of Schizophrenic Excitement would 
disappear and a factor of Conceptual Disor- 
ganization would be identified. 

Four of the factors isolated, Paranoid 
Projection, Melancholy Agitation, Conversion 
Hysteria, and Phobic-Compulsive reaction, 
were found to be indistinguishable from those 
originally isolated. The factors of Schizophre-
nic Excitement, Manic-Depressed, and Para-
noid Condition were replaced by an Excite-
ment with Hostile Belligerence, an Excite-
ment with Expansiveness, and a factor of
Thinking Disorganization. The three second-
order factors isolated were interpreted as
Morbid Intropunitiveness, Manic Excitement,
and Personality Disorganization.


Received February 15, 1957. 
Rotational Procedures and Descriptive Inferences 


J. R. Wittenborn 


Rutgers University 


After the writer and his colleagues had 
published a series of factor analytic studies 
of psychiatric symptoms (5, 6, 8, 9, 16), sev- 
eral inquiries were received about the possible 
application of oblique methods of rotation to 
these studies. Accordingly, the set of oblique 
rotations that Lorr has published for our 
largest and most homogeneous sample is of 
particular interest (1). Since we never pub- 
lished oblique rotations for these studies, the 
writer takes this opportunity to offer a few 
comparative comments concerning orthogonal 
and oblique rotations and to indicate why 
oblique rotations were not considered desir- 
able in our analyses of the symptom rating 
scales. 

The published series of factor analyses were 
instrumental to the preparation of a quanti- 
tative procedure for the multiple descriptive 
diagnosis of patients suffering from severe 
psychological disorders (2, 20). We were in- 
terested in developing such a procedure to 
provide criteria scores whereby we might ex- 
amine the descriptive implications of certain 
psychological tests and whereby we might also 
evaluate symptom changes as a result of vari- 
ous treatment procedures (2, 5, 10, 11, 12, 
18, 19). Since no suitable theory of the psy- 
choses provided intervening variables which 
could be used as a basis for the desired quan- 
tifications, we were obliged to rely upon de- 
scriptive concepts as a basis for our quanti- 
fications. It was hoped that a limited number 
of descriptive concepts could be inferred in 
such a manner that they would have the fol- 
lowing characteristics: 


1. They would describe the symptoms that psychi- 
atrists consider important in evaluating their patients. 

2. They would be as few as possible in number and 
yet be a descriptively adequate summary for the 
standard set of symptoms. 


3. They would be plausible in nature and, if pos- 
sible, refer to existing descriptive concepts. 

4. Their relationships with each other would be 
comprehensible (all other things equal, orthogonally 
related concepts would be more comprehensible than 
intercorrelated or obliquely related concepts). 


In brief, we found that it was possible to 
build a practicable set of 55 symptom rating 
scales (20). The intercorrelations among the 
symptom rating scales could be adequately 
expressed in terms of nine independent factors 
(the residual correlations were not reliable or 
consistently interrelated). These nine inde- 
pendent factors could be orthogonally rotated 
to reveal symptom clusters which were remi- 
niscent of classical diagnostic concepts, in- 
cluding the manic-depressive bipolarity (4, 
6). Thus the inferred descriptive concepts not 
only had the advantage of being familiar 
(such plausibility should not be considered 
any more likely for orthogonal than inde- 
pendent rotations, however), but they were 
independent of each other. 

In view of our purposes, it is easy to see 
why we were satisfied with our orthogonal 
rotations and chose to emphasize them in our 
publications. The reader may be interested, 
however, in the reasons which specifically led 
us to avoid the use of oblique rotations in 
developing our procedure: 


1. Trying to describe patients in terms of 
several independent abstract categories is 
difficult enough; trying to describe them in 
terms of complexly interdependent categories 
is likely to be discouraging to most practical 
psychologists who have little interest in quan- 
titative refinements. 

2. We had applied an oblique procedure to 
one of our studies and found the results less 
intelligible than the results of the orthogonal 
procedure. That is, the factors yielded by the 
oblique procedure were not as familiar in our 
clinical perceptions as the orthogonal factors, 
and the manic-depressive bipolarity was ob- 
scured. 

3. We desired to infer constructs which 
would have descriptive merit for several dif- 
ferent samples (and analyzed several different 
samples with this purpose in mind [4, 6, 8, 
9, 16]). In view of this broad interest, it was 
feared that descriptive inferences that re- 
flected the interrelationships among clusters 
in a particular sample could have less general 
descriptive merit than inferences based on a 
rotational procedure which showed the gen- 
eral organization of symptoms and did not 
overdescribe one sample at the cost of under- 
describing another. 

4. We were not interested in the interrela- 
tionships among exactly defined clusters. Al- 
though a study of such interrelationships can 
be the proper subject for investigation, it was 
considered more practical to be interested in 
the interrelationships of the scores generated 
by our completed descriptive procedure (20) 
than to be interested in the possible inter- 
relationships among the symptom clusters 
from which our descriptive concepts were 
inferred. 

5. Various samples were chosen to reveal 
the general nature of the inferences which 
could be most descriptively valuable. The 
samples were not selected to provide a valid 
basis for confirming the kinds of interrelation- 
ships which might be found among symptom 
clusters. 

Lorr comments that the results of factor 
analyses will vary from study to study as a 
result of differences in the intercorrelated 
variables (1). It should be added that, if the 
samples of subjects which provide the data 
vary from each other with respect to the vari- 
ety and organization of their behavior, the 
results of factor analyses will vary from study 
to study despite the uniformity of the inter- 
correlated variables. Indication of this is pro- 
vided in our studies which permit a compari- 
son of old (8) and young patients (8), newly 
admitted fulminating patients (6), and 
chronic patients (4). All of these groups pro- 
vided a slightly different basis for descriptive 
inference despite the fact that the same set 
of rating scales was used. 


With respect to Lorr’s interest in the dif- 
ference between his oblique rotation of the 
Connecticut factors and other oblique rota- 
tions of other symptom data, the writer would 
suggest that the differences might be due 
either to the differences in the group of symp- 
toms intercorrelated or differences in the vari- 
eties of symptoms which characterize the sam- 
ple. In this connection, it should be noted that 
the Connecticut sample comprised relatively 
young patients without organic disorder (6). 
They were newly admitted patients and for 
the most part in the “fulminating” stage of 
their disorder. We have found the concept of 
schizophrenic excitement very useful in de- 
scribing such patients (6, 19). The concept of 
hebephrenic (or deteriorated) schizophrenic is 
useful in describing patients who have been 
hospitalized for a long time (4). Our proce- 
dure was not intended to be exhaustively de- 
scriptive of patients’ disorders. Our sample of 
symptom rating scales emerged from confer- 
ences with numerous psychiatrists and was 
considered by them to comprise the most im- 
portant symptoms for describing patients who 
have severe functional disorders but who are 
not yet far advanced in their illness (4). The 
descriptive implications of the symptoms have 
been extensively and intensively explored, 
however (7, 13, 14, 15, 17, 19). 

This comparison serves to remind us that 
the value of inferences can be better judged 
on the basis of the purposes they serve than 
on the basis of the manner in which they are 
derived. This comment is particularly perti- 
nent when we compare rotational procedures 
which, regardless of their outcome, are equally 
adequate for the description of the originating 
data. 
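
The statement that the rotational procedures are equally adequate for describing the originating data can be checked numerically. The Python sketch below uses hypothetical loadings `F` and hypothetical oblique axes; the helper names are assumptions as well. It relies on the standard relations for an oblique solution with unit-length primary axes as the columns of T: factor correlations Phi = T'T and pattern loadings P = F(T')^-1. The correlations reproduced from the oblique solution, P Phi P', are then compared with those reproduced from the orthogonal solution, F F'.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def unit(v):
    n = math.hypot(*v)
    return [x / n for x in v]

# Hypothetical orthogonal loadings: four scales on two factors.
F = [[0.60, 0.30], [0.55, 0.35], [0.20, 0.65], [0.25, 0.60]]

# Hypothetical primary axes of unit length; they are not at right angles.
axes = [unit([0.90, 0.40]), unit([0.30, 0.95])]
T = transpose(axes)                        # columns of T are the axes

phi = matmul(transpose(T), T)              # correlations between oblique factors
P = matmul(F, inverse_2x2(transpose(T)))   # oblique pattern loadings

orthogonal_fit = matmul(F, transpose(F))
oblique_fit = matmul(matmul(P, phi), transpose(P))

max_gap = max(abs(a - b)
              for row_a, row_b in zip(orthogonal_fit, oblique_fit)
              for a, b in zip(row_a, row_b))
print(f"largest difference in reproduced correlations: {max_gap:.2e}")
```

The difference is zero apart from rounding error, so the choice between the two frames rests on interpretability and purpose, not on fit to the correlations.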


Reply. 
Received April 18, 1957. 
Orthogonal Versus Oblique Rotations 


Maurice Lorr 
Veterans Administration, Washington, D. C. 


In psychology, as in any scientific field, the 
observed data can be interpreted in a number 
of ways. The investigator is free to postulate 
a model and demonstrate that it is consistent 
with experimental evidence. Other investiga- 
tors may postulate alternative models or theo- 
ries also compatible with the basic data. The 
choice of a particular model will thus depend 
upon criteria such as fruitfulness for further 
investigation, consistency with already devel- 
oped theory, simplicity, geometric fit to the 
data, and plausibility. 

Once the factors have been extracted in a 
factorial experiment the investigator must de- 
cide whether to utilize an orthogonal or an 
oblique framework for the factors. Orthogonal 
rotations have the theoretical advantage of 
simplicity. Each factor may be considered to 
be independent of all others. Orthogonality 
also has the advantage if the scores are to be 
described statistically in terms of the factors. 
However, if the scores used are simply a 
weighted combination of ratings on the scales 
defining a cluster, as they are for the Witten- 
born rating schedule, then there is no advan- 
tage over the oblique factors. The solution is 
thus hypothetical since the variables are rarely 
independent statistically. In fact, the gain in 
understandability of the orthogonal factor 
would also seem to be doubtful. Most investi- 
gators certainly express no difficulty in inter- 
preting ordinary test scores which are typi- 
cally correlated with other tests in a battery. 
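
The point about cluster scores can be made concrete with a small Python sketch. The loadings are hypothetical, unit weights are assumed for the scales defining each cluster, and the function names are illustrative. Even though the two factors are orthogonal, sums of the ratings that mark each factor are correlated, because every scale carries some loading on both factors.

```python
import math

# Hypothetical orthogonal loadings: scales 0-1 mark one cluster, scales 2-3 another.
F = [[0.60, 0.30], [0.55, 0.35], [0.20, 0.65], [0.25, 0.60]]

def reproduced_correlations(loadings):
    """Correlations implied by orthogonal factors, with unities in the diagonal."""
    n = len(loadings)
    return [[1.0 if i == j
             else sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(n)] for i in range(n)]

def composite_correlation(r, group_a, group_b):
    """Correlation between two unit-weighted sums of standardized ratings."""
    cov = sum(r[i][j] for i in group_a for j in group_b)
    var_a = sum(r[i][j] for i in group_a for j in group_a)
    var_b = sum(r[i][j] for i in group_b for j in group_b)
    return cov / math.sqrt(var_a * var_b)

R = reproduced_correlations(F)
print(f"{composite_correlation(R, [0, 1], [2, 3]):.2f}")
```

For these illustrative values the two cluster scores correlate about .46, although the factors from which the clusters were inferred are independent.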

Oblique rotation has recently gained greater 
acceptance among factor analysts since it 
seems likely that psychological variables are 
rarely independent. On a priori grounds it is 
plausible to expect that the underlying dimen- 
sions in this domain would be correlated. 
Moreover, simple structure is best obtained 
by oblique rotation without the restrictive as-
sumption of orthogonality and a considerably
better fit of the data to the model is thereby 
secured. An examination of the two-dimen- 
sional plots of the Wittenborn orthogonally 
rotated factors supports this argument. The 
Schizophrenic Excitement reference vector is
poorly defined in most of the plots. In this 
particular instance, the orthogonal framework 
does not provide a satisfactory geometric fit 
to the data. 

Since fewer assumptions are made con- 
cerning the data and the simple structure 
achieved is better, it is plausible to suppose 
on logical grounds that an oblique structure 
would be more stable from sample to sample 
than an orthogonal structure. Unfortunately,
there are few data to check this conjecture.

For some, the principal basis for preferring 
oblique rotations lies in the more general sec- 
ond-order factors descriptive of the correla- 
tions obtaining between the more elementary 
factors. The second-order factors provide ten- 
tative definitions of more inclusive response 
variables. There is already much evidence in 
the literature indicative of such factors in the 
domain under discussion. An extrapunitive-
intropunitive parameter has been repeatedly
identified (1, 2, 3). Others are factors of 
thinking disorganization, of social withdrawal 
with motor disturbances, and manic excite- 
ment. If the construct validity of these can be 
further established their intrinsic interest is 
greater than that of the primary factors. 

Wittenborn’s aim was to identify factors 
conforming as far as possible to present psy- 
chiatric syndromes which were, at the same 
time, as few in number as possible. In our own 
work we have explicitly hypothesized certain 
constructs that grew out of clinical experience 
and theory. The data were then used to test 
whether or not these constructs were sup-
ported in fact. We viewed the psychiatric
diagnostic groupings as classes of patients, 
not as syndromes. We were interested instead 
in the parameters underlying symptoms and 
deviant behavior. Establishment of distin- 
guishable patient groups we regarded as a 
separate problem. 

In summary, we have given greater weight 
to that mathematical model which required 
the fewest assumptions and which yielded the 
best simple structure fit to the data. Our aim 
has been to identify parameters of psychopa- 
thology of some generality regardless of their 
possible use in clinical practice. It is these dif-
ferences in goals and in models that
the reader must consider in making his choice.


Rejoinder. 
Received June 26, 1957. 


References 


1. Degan, J. W. Dimensions of functional psychosis. 
Psychometr. Monogr., 1952, No. 6. 

2. Lorr, M., Jenkins, R. L., & O'Connor, J. P. Fac- 
tors descriptive of psychopathology and be- 
havior of hospitalized psychotics. J. abnorm. 
soc. Psychol., 1955, 50, 78-86. 

3. Lorr, M., O’Connor, J. P., & Stafford, J. W. Con- 
firmation of nine psychotic symptom patterns. 
J. clin. Psychol., 1957, 13, 252-257. 




Journal of Consulting Psychology 
Vol. 21, No. 6, 1957 


The Social Desirability Stereotype and Some
Measures of Psychopathology¹


C. James Klett and Arthur S. Tamkin 
VA Hospital, Northampton, Massachusetts 


In recent studies (2, 3) of the stability of 
the social desirability scale values of the items 
comprising the Edwards Personal Preference 
Schedule (1), striking agreement in judg- 
ments of what constitutes socially desirable 
or undesirable behavior was found between 
college and high school students (r = .93), 
college students and hospitalized mental pa-
tients (r = .88), and high school students and
mental patients (r = .87). In spite of this 
high overall agreement, there were some sys- 
tematic and significant differences among the 
three groups in terms of specific subscales 
(psychological needs) included in the 140 
items. No essential differences were found 
when the hospital group was broken down 
into a psychotic and a nonpsychotic group, 
however (r = .90). 

In an attempt to find some means of sepa- 
rating this patient group into subgroups show- 
ing more heterogeneity in the social desirabil- 
ity stereotype, 84 of the 118 patients who had 
scorable MMPIs were sorted into three differ- 
ent dichotomies using the medians on the 
Barron Ego Strength Scale (Es), the Pathol-
ogy Scale (P), and the Social Desirability
Scale (SD) from the MMPI as criteria. These 
dichotomies were then evaluated for hetero- 
geneity of the stereotype by correlation and 
for systematic differences in subscales by 
means of the sign test. Another dichotomy 
made up of two modal groups formed by the 
combination of the three dichotomies was also 
examined. 

1 An extended report of this study may be obtained
without charge from Dr. C. James Klett, Veterans
Administration Hospital, Northampton, Mass., or
for a fee from the American Documentation Institute.
Order Document No. 5362, remitting $1.75 for mi-
crofilm or $2.50 for photocopies.


Although considerable homogeneity of the 
stereotype was present in all comparisons, the 
dichotomies formed by P and by SD showed 
significantly less homogeneity than the other 
dichotomies (r = .68 and .71 respectively). 
In the sign test analysis it was found that 
patients high on P and/or low on SD judged 
need Endurance items to be significantly less 
socially desirable and need Heterosexuality 
items to be significantly more socially desir- 
able than the corresponding low P and high 
SD groups. The high Es group judged need 
Achievement and need Endurance items to be 
significantly more socially desirable and need 
Heterosexuality and need Autonomy items to 
be significantly less socially desirable than the 
low Es group. It was felt that both the in- 
creased heterogeneity in the social desirability 
stereotype and the particular subscale differ- 
ences found were accounted for most parsi- 
moniously by assuming that the MMPI scales 
had separated the subjects on the basis of 
their willingness to conform to group judg-
ments about what is socially desirable. This
may or may not be related to degree of 
psychopathology. 


Brief Report. 
Received July 17, 1957.


A Factor-Analytically Based Rationale for the
Wechsler Adult Intelligence Scale¹


Jacob Cohen 
Franklin D. Roosevelt Veterans Administration Hospital and New York University 


In addition to its utility in the charting of 
a new domain, factor analysis is an important 
tool of psychometric analysis, one which can 
provide the clinician with insight into what his 
tests are measuring. 

From a previous factor-analytic study of the 
Wechsler-Bellevue in psychoneurotic, schizo- 
phrenic, and brain-damaged patients (2), 
there emerged a rationale for the Wechsler- 
Bellevue as applied to neuropsychiatric pa- 
tients (1). In addition to a description of the 
measurement function of each subtest, two 
major conclusions were drawn: (a) although 
the same factors emerged from the three 
groups, some of the subtests had different
measurement functions in the different groups,
and (b) the rationales in the literature which
attribute a high order of specificity of meas- 
urement to the subtests are unjustified in the 
light of their relatively high communalities 
and relatively low reliabilities (1, p. 277; 2, 
p. 365). 

1 From the Psychology Service, Franklin D. Roose-
velt Veterans Administration Hospital, Montrose,
New York. The author gratefully acknowledges the
cooperation of Drs. Leon L. Rackow, Manager,
George Rosenberg, Director of Professional Services,
Oskar Diethelm, Chairman of the Dean’s Commit-
tee, and Seymour G. Klebanoff, Chief, Psychology
Service.


With the publication of the Wechsler Adult
Intelligence Scale (WAIS) (4, 8), which is a
revision and restandardization of Form I of
the Wechsler-Bellevue, it became possible to
perform a comparative factor analysis of nor-
mals over a wide age-range. This was done,
and has been separately reported (3). The
groups studied were ages 18-19 (N = 200),
25-34 (N = 300), 45-54 (N = 300), and
60-over 75 (N = 352). It was found that the
same major factors were operative over the
entire age range, and, moreover, that these
were essentially the same factors as had been
identified for neuropsychiatric populations on
the Wechsler-Bellevue,² as follows:

Factor A—Verbal Comprehension—Vocabu-
lary richness and verbal-symbolic manipula-
tive ability.

Factor B—Perceptual Organization—The 
organization of nonverbal, visually perceived 
material against a time limit. 

Factor C—Memory—This involves a re- 
naming of what was previously called Free- 
dom from Distractibility (1, 2), not a differ- 
ent factor (3). It involves both immediate 
memory as well as the efficiency with which 
previously learned material can be called up 
when needed. 

Two minor factors, both of which loaded 
only one subtest consistently, were found. 
These have no analogues in the previous study 
(1, 2), but their consistent appearance in this 
study argues strongly for their “reality.” Fac- 
tor D is a Picture Completion specific found 
in all four groups and Factor E is a Digit 
Symbol specific found in all but the oldest 
group. 

Finally, a second-order factor was found 
and, as was the case in (2), was interpreted 
as present general intellectual functioning and 
identified as G. It is a strong factor, account- 
ing for about half the total variance of the 
subtests. 


2 The names given the factors differ, but their com- 
position is essentially the same. Such differences as 
do emerge are interpreted as being the consequence 
of a refinement in the factor-analytic method made 
possible by larger samples and cross checking among 
groups (3).


The factor analysis described in (3) was a
standard one: centroid extraction and oblique 
rotation to simple structure in each of the 
four groups. For the purpose of adducing a 
rationale for the subtests, a supplementary 
analysis was performed. This analysis, after 
Thomson (6, pp. 189-191), provides a factor
table in which the correlation of each sub- 
test is given with G and also with the com- 
mon factor-specifics, i.e., what is specific to 
each factor after G has been removed. Since 
G and the factor-specifics are mutually inde- 
pendent in this analysis, the square of each 
subtest-factor correlation gives the proportion 
of the variance of the subtest attributable to 
the factor in question. 

Such an analysis was performed for each of 
the four age groups of the standardization 
population. It was found in (3), however, 
that with one major exception, the four groups 
were quite similar with regard to the factorial 
composition of the subtests. The exception 
was that for the aged group (60-over 75), 
the Memory factor involves several of the 
verbal subtests, and that this increase in vari- 
ance attributable to Memory is accompanied 
by a reduction in the variance taken up by G 
(3, Table 6). Because of this change which 
occurs with advanced age, the interpretation 
of several subtests (primarily verbal) is af-
fected. Accordingly, in Table 1 the entries
represent averages by means of the Fisher z
transformation of r (5, pp. 132-133) over the
three younger groups (18-19, 25-34, and 45-
54), and Table 2 gives similar data for the
aged group (60-over 75).


Table 1


Mean Correlations Between WAIS Subtests and Factors
of the 18-19, 25-34 and 45-54 Age Groups*


Subtest             G      A      B      C      D      E

Information        83      ?      ?      ?      ?      ?
Comprehension      72     38     03     02     06    -09
Arithmetic         71     06     05     32      ?    -05
Similarities       77     27    -05    -02     15     08
Digit Span         62    -08    -03     29     02     15
Vocabulary         83     39      ?      ?      ?      ?
Digit Symbol       65      ?      ?      ?     12     28
P. Completion      75     05     07    -02     32      ?
Block Design       70    -01     41      ?     02     01
P. Arrangement     70     07     20    -01     08     10
O. Assembly        64      ?     46      ?      ?      ?

Percentage of
Variance        52.3%   4.6%   4.7%   2.1%   1.8%   1.5%


* Decimal points omitted. Values of .20 and greater italicized.
A question mark marks an entry not legible in the source; the
entries shown for Information, Vocabulary, and O. Assembly are
those cited in the text.


Table 2


Correlations Between WAIS Subtests and
Factors of the 60-Over 75 Age Group*


Subtest             G      A      B      C      D

Information        73     29      ?     40     07
Comprehension      60     39     01     37    -07
Arithmetic         63     04     05     51     05
Similarities       65     42     09     10    -01
Digit Span         48     00     08     44     02
Vocabulary         66     37    -02     50    -06
Digit Symbol       71    -06     26     21     13
P. Completion      75     07     04     01     25
Block Design       65     00     53     09     00
P. Arrangement     65     30     18    -09     07
O. Assembly        58    -01     51    -03     02

Percentage of
Variance        42.1%   5.9%   6.1%   9.7%   0.9%


* Decimal points omitted. Values of .20 and greater italicized.
A question mark marks an entry not legible in the source.
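
The averaging by Fisher z transformation used for Table 1 can be illustrated with a short Python sketch. Each group's correlation is converted to z, the z values are averaged, and the mean is converted back to r. The three input correlations below are hypothetical, since the per-group values are reported only in (3), and the equal weighting is an assumption; the text does not say whether the groups were weighted by size.

```python
import math

def fisher_mean_r(rs, weights=None):
    """Average correlations via Fisher's z = atanh(r), then back-transform."""
    if weights is None:
        weights = [1.0] * len(rs)            # assumption: equal weighting
    zs = [math.atanh(r) for r in rs]         # r -> z
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)                  # z -> r

# Hypothetical correlations of one subtest with G in the three younger groups.
example_rs = [0.68, 0.72, 0.75]
mean_r = fisher_mean_r(example_rs)

# Optional alternative: weight each z by N - 3, using the group sizes in the text.
weighted_r = fisher_mean_r(example_rs, weights=[200 - 3, 300 - 3, 300 - 3])
```

Averaging in the z metric rather than averaging r directly is the conventional choice because the sampling distribution of z is more nearly symmetric than that of r.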

The entries in the tables represent the cor- 
relation of the subtest with G and each inde- 
pendent factor-specific. The square of each 
entry is the proportion of that subtest’s vari- 
ance attributable to the factor in question. At 
the foot of each column is given the percent- 
age of the total variance of the subtests at- 
tributable to the factor. 
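
The percentages at the foot of each column can be checked with a brief Python sketch. It assumes that each subtest is standardized to unit variance, so that the percentage of total variance for a factor is the mean of the squared correlations in its column. The values are the Table 2 entries with decimal points restored; column B is left out because one of its entries is not legible in this copy.

```python
# Table 2 entries (aged group), in the order the subtests are listed.
TABLE_2 = {
    "G": [.73, .60, .63, .65, .48, .66, .71, .75, .65, .65, .58],
    "A": [.29, .39, .04, .42, .00, .37, -.06, .07, .00, .30, -.01],
    "C": [.40, .37, .51, .10, .44, .50, .21, .01, .09, -.09, -.03],
    "D": [.07, -.07, .05, -.01, .02, -.06, .13, .25, .00, .07, .02],
}

def percent_of_total_variance(loadings):
    """Mean squared correlation over the subtests, as a percentage."""
    return 100.0 * sum(r * r for r in loadings) / len(loadings)

column_percentages = {
    factor: round(percent_of_total_variance(values), 1)
    for factor, values in TABLE_2.items()
}
```

Under these assumptions the computed figures agree with the printed column feet for G, A, C, and D (42.1, 5.9, 9.7, and 0.9).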


Information 


For the younger groups, Information shares 
with Vocabulary the distinction of being the 
best measure of G among the WAIS subtests, 
with a correlation of .83 (see Table 1). It 
also consistently measures Verbal Compre- 
hension, but does so to a somewhat lesser 
degree than do Vocabulary and Comprehen- 
sion. 

In the aged group, Information continues 
to be a relatively good measure of G, but, 
consonant with the general decline of G vari- 
ance for these subjects (Ss), its correlation 
with G falls to .73 (see Table 2). When this 
difference is expressed in terms of the per- 
centage of its variance attributable to G, its 
magnitude is appreciable, 69% to 53%. 
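
The conversion from correlation to percentage of variance is simply a squaring, as the following Python sketch shows for Information. The G correlations are those quoted in the text; the correlations with Factors A and C for the aged group are the Table 2 entries. The function and label names are illustrative only.

```python
def variance_share(r):
    """Proportion of a subtest's variance attributable to a factor (r squared)."""
    return r * r

# Correlations of Information with the factors named in each label.
information = {
    "G, younger groups": 0.83,
    "G, aged group": 0.73,
    "Factor A, aged group": 0.29,
    "Factor C, aged group": 0.40,
}

# Percentages of Information's variance, rounded to whole numbers.
shares = {label: round(100 * variance_share(r)) for label, r in information.items()}
```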

This subtest also illustrates the importance 
of the Memory factor in the age group 60-over
75. Although Information continues to
correlate with the Verbal Comprehension fac- 
tor, it is more highly correlated with the 
Memory factor for these aged subjects, in 
variance terms the comparison being Factor A 
8% and Factor C 16%. The factorial com- 
plexity of this subtest for the aged results in 
ambiguity of interpretation. A given subtest 
score can be the result of many combinations 
of levels of verbal and memory ability. Thus, 
a poor Information score for an aged S may 
result either from poor life-long verbal abil- 
ity or senescent decline in memory ability. 

Whatever Information measures validly 
other than G and Verbal Comprehension (and 
Memory for the aged), it can do so with only 
about 10% of its variance (3, Table 7). 
With specificity so low, interpretations of In- 
formation along lines other than those above 
are necessarily hazardous. 

In summary, Information among the WAIS 
subtests provides the all-around best measure 
of G over the entire age range, fully justifying 
Wechsler’s faith in it (7, pp. 77-78). It is not, 
however, a good choice to use as a measure 
of Verbal Comprehension, both because other 
tests surpass it in this regard and because of 
its complexity in aged Ss. 


Comprehension 


This subtest holds median rank as a meas- 
ure of G for younger Ss (.72), but shares first 
rank with Vocabulary in providing the best 
single measure of the Verbal Comprehension 
factor (see Table 1) and measures no other 
common factor. 

In the oldest group, its loading of .60 on G 
is low both absolutely and relatively, ranking 
third from the lowest among the subtests (see 
Table 2). With regard to common factor meas- 
urement, it suffers the same fate of ambiguity 
as does Information: it responds to individual 
differences in memory ability as well as in 
verbal ability. In the case of Comprehension, 
the proportions attributable to the two factors 
are about equal, Factor A 15% and Factor C 
14%. 

The specificity of Comprehension averages 
to the same value as for Information, 10% 
(3, Table 7). Although it was found that 
Comprehension on the Wechsler-Bellevue may
have specific variance tied to “judgment” for
schizophrenics (1, p. 273), its revised form
in the WAIS has such low specificity for nor- 
mals, that singling it out of a record for 
unique interpretation is unjustified. 

The major utility of Comprehension, then, 
is as a measure of the Verbal Comprehension 
factor from early adulthood through middle 
age. It is not a good measure of G over the 
entire age range, and is an ambiguous meas- 
ure of A and C for old Ss. 


Arithmetic 


The picture which emerges in the analysis 
of this subtest’s measurement characteristics 
for Ss up to middle age is that it is a mediocre 
measure of G, and beyond this a “pure,” al- 
though weak, measure of the Memory factor 
(see Table 1). Since G accounts for an aver- 
age of 50% of its variance, and the Memory 
factor an average of 10%, much of the vari- 
ance is yet to be identified. From the reli- 
ability coefficients given in the manual (8, 
p. 13), it is estimated that about 19% of 
the subtest’s variance is specificity, and this
amount is among the largest obtained among 
the subtests. With specificity variance about 
twice as great as Memory factor variance, 
judgment about Memory ability from this 
test alone is rather risky. 

For the aged group, the picture changes
somewhat. G measurement is reduced (r =
.63), while the Factor C correlation goes up 
to .51 (see Table 2). Unfortunately, the reli- 
ability coefficients for the subtests are not 
available for this group, so that determina- 
tion of the specificity of Arithmetic is not pos- 
sible. Nevertheless, since Memory variance is 
so high (26%), one can more safely attribute 
scores to this factor in the aged group, after 
making due allowance for G. 

Thus, Arithmetic is a mediocre measure of 
G (compared to other subtests), and a pure 
measure of memory ability, weakly for the 
younger three groups and strongly for the 
aged group. 


Similarities


The Similarities subtest in younger Ss ranks
third among the subtests in correlation with 
G (.77), and although it correlates to a ma-
terial degree only with Factor A, its correla-
tion is small both absolutely (.27) and rela-
tively (see Table 1). Its average specificity is
17% (3, Table 7), which is higher than for
any of the other subtests correlating with 
Factor A. The combination of relatively high 
specificity and low Verbal Comprehension 
variance makes it an untrustworthy measure 
of the latter when used by itself. 

On the other hand, for aged patients, it has 
measurement characteristics that recommend 
it to the clinician. It is the only verbal subtest 
which correlates with Factor A (.42) and does 
not also correlate with the Memory factor (see 
Table 2). After allowing for its G measure- 
ment (.65), it can therefore be used as a 
measure of Verbal Comprehension, uncon- 
taminated by variance in the Memory factor. 
(Its correlation with G is median among the
subtests.)

In summary, while Similarities has nothing 
particular to commend it up through middle 
age, among aged Ss it has the property of 
being a measure of Verbal Comprehension un- 
affected by individual differences in memory 
ability. 

Digit Span 

In the younger three groups, Digit Span is 
the poorest measure of G (.62), and a “pure” 
but weak measure of the Memory factor (see 
Table 1). As is the case with Arithmetic, its 
specificity is quite high, accounting for an av- 
erage of 20% of its variance (3, Table 7). 
Further, its reliability is relatively low (8, p. 
13). These considerations make it difficult to 
attribute to Memory any given score on Digit 
Span. As will be noted later, however, an esti- 
mate of an S’s memory ability can be devised 
by combining this test with Arithmetic. 

Digit Span continues to be the poorest 
measure of G for the aged group, with an r of
.48 (see Table 2). It measures the Memory
factor purely here, too, and to almost the 
same degree as it does G (.44). This test can 
therefore be used as a measure of Memory, 
which, in the light of the findings, becomes 
particularly important for aged Ss. 


Vocabulary 


The results of the present analysis again 
validate the high regard in which this test is 
held in the testing field, at least up through 
middle age. It is unexcelled both as a measure 
of G (.83), and in its loading in the common-
specific factor Verbal Comprehension (.39).
Thus, if a single subtest is needed to measure 
present general intellectual functioning via a 
verbal avenue, Vocabulary should be the sub- 
test of choice for younger Ss. 

Among the aged, the picture changes mate- 
rially. Firstly, Vocabulary becomes a much 
poorer measure of G (.66), falling at the 
median of the subtests in this regard (see 
Table 2). 

A further difference in Vocabulary’s meas- 
urement functions with the aged is the by now 
familiar incursion of Memory variance. Fully 
25% of its variance (an r of .50) is associated 
with Factor C, far more than the 14% (an r
of .37) attributable to Verbal Comprehension 
(see Table 2). This is the same problem al-
ready encountered with Information and Compre-
hension, but raised to an even greater degree.
Poor scores on this test are actually more 
likely to reflect poor memory ability (conse- 
quent upon senescent deterioration) than poor 
verbal ability per se. From the score on Vo-
cabulary alone, however, the clinician cannot
make an unambiguous interpretation. 

Vocabulary emerges as a good measure of 
G and Verbal Comprehension until old age, 
when it becomes a mediocre measure of G, 
and an ambiguous measure of both the Mem- 
ory and Verbal Comprehension factors. 


Digit Symbol 


With the Digit Symbol test we have our 
first encounter with one of the minor specific 
factors, Factor E. For younger Ss, it measures 
only this among the common factors and its 
correlation with G (.65) is among the poorest 
(see Table 1). Since only this test measures 
Factor E consistently, a positive interpreta- 
tion of this factor is not possible—all that can 
be said is that it is not a measure of per- 
ceptual organization, or of memory (or, of 
course, of verbal ability). 

The specificity of Digit Symbol cannot be 
stated with certainty, but is probably very 
high. The only reliability information avail- 
able is a range-corrected alternate form co- 
efficient for a group of female nursing school 
applicants (8, pp. 12-13). If the obtained 
value of .92 is used as an estimate of the re- 
liability coefficient for our younger groups, the 
estimated average specificity which results is
37%. Thus, specificity accounts for almost as
much of the subtest’s variance as G (42%), 
and much more variance than E (8%). 

In the aged group, as has been noted, Fac- 
tor E failed to appear. Beyond G measure- 
ment, which is relatively high (.71), the Digit 
Symbol test is significantly but weakly related 
to Perceptual Organization (.26) and the 
pervasive Memory factor (.21). In addition, 
using the same estimated reliability coefficient 
as for the younger group, specificity accounts 
for some 28% of the variance, a large pro- 
portion. 
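
The specificity estimate for the aged group can be approximated with the conventional partition of a standardized subtest's variance into common, specific, and error parts. The Python sketch below assumes that this is the partition the estimate rests on: communality is the sum of squared correlations with the mutually independent factors, specificity is reliability minus communality, and error is one minus reliability. The correlations are the Digit Symbol entries of Table 2, and .92 is the reliability estimate quoted above.

```python
def partition_variance(loadings, reliability):
    """Split unit variance into common, specific, and error components."""
    communality = sum(r * r for r in loadings.values())  # common-factor variance
    specificity = reliability - communality              # reliable but unique
    error = 1.0 - reliability                            # unreliable remainder
    return communality, specificity, error

# Digit Symbol, 60-over 75 group (Table 2), decimal points restored.
digit_symbol_aged = {"G": .71, "A": -.06, "B": .26, "C": .21, "D": .13}

common, specific, error = partition_variance(digit_symbol_aged, reliability=0.92)
```

With these inputs the specificity comes to about .28, in line with the figure of some 28% given in the text.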

Because of the estimated large specificity, 
its small loading on an uninterpretable minor 
factor in the younger groups, and its com- 
plexity in the aged, scores from this test re- 
sist interpretation as to common factor sig- 
nificance. Although Digit Symbol is a fairly 
good measure of G for aged Ss, there are bet- 
ter subtests for this purpose in the battery 
(Information and Picture Completion, see 
Table 2). 


Picture Completion 


The average correlation with G for Picture 
Completion in the younger groups is .75, the 
highest value yielded among the performance 
subtests in the younger groups (see Table 1). 
It also measures the other specific minor fac- 
tor D with 10% of its variance, and no other 
factor. An additional 15% of its variance is 
taken up in specificity. Since Factor D is un- 
interpretable, it has no utility for common 
factor measurement, but in situations where 
a nonverbal measure of G from a single sub- 
test is needed, the evidence of the present re- 
search is that it is the best choice. 

For the 60-over 75 group, this subtest has 
the same G correlation (.75), but here it is the 
best single subtest for measuring G in the en- 
tire battery (see Table 2). As is true in the 
younger groups, it measures Factor D purely, 
but weakly. 

The special utility of Picture Completion 
lies in its G measurement, which is best in the 
battery for aged Ss and best among the per- 
formance subtests for younger Ss. 

In previous factor-analytic research with the 
Wechsler-Bellevue (2), it was reported that 
Picture Completion was a complex test, meas-
uring both Factors A and B for patients (1,
pp. 275-276). This seems now to be an error,
a consequence of failure to extract enough 
factors, due in turn to a low N. In extracting 
two additional factors in (3), the ambiguity 
due to the test’s apparent complexity in known 
factors was removed, unfortunately only to be 
replaced by ambiguity resulting from a load- 
ing in an unknown factor. A parallel situation 
exists for the Digit Symbol test (1, pp. 276— 
277). 


Block Design 


The measurement characteristics of Block
Design are quite straight-forward and similar 
for younger and aged patients. It falls at 
about the median with regard to strength of 
correlation with G, with values of .70 for the 
younger three groups and .65 for the aged 
group (see Tables 1 and 2). It is the first 
subtest thus far encountered whose common 
factor variance is completely in the Percep- 
tual Organization factor. The correlations with 
this factor, moreover, are relatively substan- 
tial: .41 is the mean value for the younger 
groups (Table 1) and .53 for the aged group 
(Table 2). 

The specificity value averages 17%, which 
is above the median of the subtests, but not 
objectionably large (3, Table 7). 

The Block Design subtest, then, is useful 
for the purpose of measuring Perceptual Or- 
ganization over the entire age range. 


Picture Arrangement 


For the younger Ss, this subtest is at about 
the median in its correlation with G (.70). 
Beyond its G measurement, although it cor- 
relates only with Perceptual Organization, it 
does so very weakly, with an average value of 
only .20 (see Table 1). This amounts to only 
4% of its variance, a negligible amount. This 
subtest’s specificity averages only 8% (3, 
Table 7), but much of its variance is taken 
up in error, its reliability for the younger 
three groups being .66, .60, and .74 respec- 
tively (8, p. 13). 

The picture becomes even worse when one 
turns to the aged subjects (Table 2). Its G 
measurement is still median, but poorer in 
absolute terms (.65). With regard to common 
factor measurement, the correlation with Fac-
tor B is even lower (.18), and it correlates
.30 with the Verbal Comprehension factor, at
about the same level as Information does. 
The factorial complexity of Picture Arrange- 
ment in these older Ss raises the concomitant 
problem of ambiguity in the interpretation of 
its scores. 

To sum up the case of Picture Arrangement, 
it is a mediocre measure of G taken by itself,
and a very weak or ambiguous measure of the 
common factors. 


Object Assembly 


The final subtest of the WAIS battery, Ob- 
ject Assembly, correlates with G .64 in the 
younger groups and .58 in the aged group, in 
each second only to Digit Span in being the 
poorest subtest in this regard (Tables 1 and 
2). As was the case with Block Design, how- 
ever, it is a fairly good measure of the com- 
mon-specific factor of Perceptual Organization 
(.46 in the younger groups and .51 in the 
aged group), and measures no other factor. 
It is slightly superior to Block Design as a 
measure of this factor in the younger three 
groups (Table 1) and slightly inferior in the 
oldest group (Table 2). 

Its specific variance is lowest among the 
subtests, averaging 5% (3, Table 7). This 
apparent advantage is of little help in the
evaluation of a given score, since its error
variance is next to the largest among the sub-
tests, averaging 32% (computed from relia-
bility coefficients in 8, p. 13).

Despite this, this test can be used effec- 
tively to measure Perceptual Organization, 
particularly when used as described below. By 
itself, it is a poor choice to measure G, and is 
about equal to Block Design as a measure of 
Factor B. 


Factor Scores 


The conventional scoring of Wechsler scales 
involves the determination, by appropriate 
combinations of subtests, of Verbal, Perform- 
ance, and Full Scale IQs. The findings from 
the factor analysis of the WAIS indicate that 
these groupings do not constitute the actual 
functional unities in intelligence test perform- 
ance. Thus, the Verbal IQ reflects individual 
differences not only in Verbal Comprehension, 
but also in Memory; the Performance IQ re-
sponds to Perceptual Organization but also to
the minor specific factors D and E. 

It is possible, on the basis of the above 
analysis, to replace the a priori verbal- 
performance subtest groupings by the func- 
tional unities actually in operation. It is sug- 
gested that scores for the three major factors 
described below may be found useful in clini- 
cal, educational, and vocational research and, 
subsequently, appraisal. Their estimation by 
averaging appropriate groups of subtests 
rather than from single tests results in con- 
siderably increased reliability. 

In the material which follows, it is neces- 
sary to convert raw scores to weighted scores 
for the purpose of achieving subtest compara- 
bility. However, the weighted scores required 
are not those normally found for the purposes 
of IQ computation, where the reference group 
was made up of standardization groups be- 
tween the ages of 20 and 34. Instead, the 
weighted score conversion tables determined 
separately for each age group as given in the 
appendix in the manual are required (8, pp. 
99-110). Since the findings of the study are 
based on correlations of groups homogeneous 
with regard to age, it is necessary that age- 
corrected weighted scores be utilized. 

Since all the subtests are correlated at least 
appreciably with G, the G factor score is 
found by simply averaging the weighted scores 
of all the subtests (or all which have been 
given). A score of 10 on this (and all other) 
factor scores indicates average functioning at 
that age level. If only a measure of G is 
needed, the Full Scale IQ, of course, serves 
this purpose. The mean subtest weighted 
score, however, provides a reference point 
comparable with the other factor scores. 

Verbal Comprehension factor scores, for all 
but the oldest patients, are gotten by averag- 
ing Information, Comprehension, Similarities, 
and Vocabulary weighted scores. Verbal Com- 
prehension ability thus measured is “high” or 
“low” relative to the population as it is above 
or below 10, and “high” or “low” ipsatively 
(i.e., within the pattern of the S’s abilities), 
as it departs from the S’s G factor score. These 
relationships hold for all the factor scores. 

The approach given in the preceding para-
graph cannot be used in aged patients, since,
as has been noted, the resulting score would
reflect almost as much Memory as Verbal 
Comprehension variance. Instead, Verbal 
Comprehension factor scores for patients past 
60 are gotten by averaging the weighted scores 
for Similarities and Picture Arrangement, 
these being the only tests loading this factor 
which load no other (besides G). 

Perceptual Organization factor scores are 
obtained by averaging the Block Design and 
Object Assembly weighted scores for both 
younger and older Ss. 

The Memory factor is also measured in the 
same way over the entire age range, namely, 
by averaging weighted scores for Arithmetic 
and Digit Span. 
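
The scoring rules just described can be summarized in a short Python sketch. The subtest groupings follow the text; the function name, the `aged` flag (taken here to mean Ss past 60), and the example scores are hypothetical. The input is assumed to be age-corrected weighted scores from the manual's appendix tables.

```python
from statistics import mean

VERBAL_YOUNGER = ("Information", "Comprehension", "Similarities", "Vocabulary")
VERBAL_AGED = ("Similarities", "Picture Arrangement")
PERCEPTUAL = ("Block Design", "Object Assembly")
MEMORY = ("Arithmetic", "Digit Span")

def factor_scores(weighted, aged=False):
    """Average age-corrected weighted scores into the four factor scores."""
    verbal = VERBAL_AGED if aged else VERBAL_YOUNGER
    return {
        "G": mean(weighted.values()),  # all subtests given
        "Verbal Comprehension": mean(weighted[s] for s in verbal),
        "Perceptual Organization": mean(weighted[s] for s in PERCEPTUAL),
        "Memory": mean(weighted[s] for s in MEMORY),
    }

# Hypothetical age-corrected weighted scores for one S past 60.
scores = {
    "Information": 9, "Comprehension": 10, "Arithmetic": 7,
    "Similarities": 12, "Digit Span": 7, "Vocabulary": 9,
    "Digit Symbol": 10, "Picture Completion": 11, "Block Design": 10,
    "Picture Arrangement": 12, "Object Assembly": 10,
}

result = factor_scores(scores, aged=True)

# Ipsative standing: each factor score relative to the S's own G factor score.
ipsative = {name: value - result["G"] for name, value in result.items() if name != "G"}
```

A factor score above or below 10 is read against the population at that age level, and a positive or negative ipsative value is read against the S's own general level.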

A possible approach to the measurement of 
age deterioration is suggested by the incursion 
of the Memory factor into the verbal sphere. 
If an older S’s Memory factor score is much 
lower than his Verbal Comprehension factor 
score, and his average score on the tests which 
share variance from both these factors, Infor- 
mation, Comprehension, and Vocabulary, is 
close to his reduced Memory factor score, this 
suggests that these verbal subtest scores have 
been reduced by failing memory ability, and 
a judgment of age deterioration would be 
made. The meaning of “much lower” and 
“close to” can only be specified by critical 
values determined by research. 
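
The logic of this judgment can be written out as a small sketch. Because the text leaves "much lower" and "close to" unspecified, both are left as caller-supplied parameters here; the names `much_lower` and `close_to`, the function name, and the example values are all hypothetical.

```python
from statistics import mean

# Tests said to share variance from both Memory and Verbal Comprehension.
SHARED_TESTS = ("Information", "Comprehension", "Vocabulary")


def suggests_age_deterioration(weighted, much_lower, close_to):
    """Apply the two conditions described in the text to an older S.

    much_lower and close_to are critical values that the text says must
    come from research; no defaults are assumed here.
    """
    memory = mean([weighted["Arithmetic"], weighted["Digit Span"]])
    # Verbal Comprehension as computed for patients past 60.
    verbal = mean([weighted["Similarities"], weighted["Picture Arrangement"]])
    shared = mean([weighted[t] for t in SHARED_TESTS])

    memory_much_lower = (verbal - memory) >= much_lower
    shared_close_to_memory = abs(shared - memory) <= close_to
    return memory_much_lower and shared_close_to_memory


# Invented weighted scores and placeholder critical values, for illustration.
older_s = {
    "Arithmetic": 6, "Digit Span": 5,
    "Similarities": 11, "Picture Arrangement": 10,
    "Information": 6, "Comprehension": 7, "Vocabulary": 6,
}
print(suggests_age_deterioration(older_s, much_lower=3, close_to=1))
```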


Summary 
A rationale for the subtests of the WAIS 
derived from factor analyses of this test for 
a wide age range (3) has been presented. 
Each subtest’s measurement function, in terms 
of a dominant general factor (G), three major 
common factors (Verbal Comprehension, Perceptual 
Organization, Memory), and two minor 
factors was presented. The results in the three 
younger groups (18-19, 25-34, 45-54) were 
found to be quite similar, but some subtests 
undergo a change in measurement function in 
the oldest group (60-over 75). Specificities 
are again (as in 1) not found high enough to 
warrant unique interpretations of the subtests. 
Methods were presented for the determination 
of factor scores which may prove useful in the 
appraisal of intelligence. 


Received February 2, 1957. 
Intellectual Ability and Mode of Perception¹ 


Douglas N. Jackson 


Pennsylvania State University 


Witkin and his colleagues (5) have re- 
ported intercorrelations among several labora- 
tory measures of orientation to the upright, 
and have shown that the latter correlate 
rather highly with a specially constructed 
embedded-figures test (4), in which S is in- 
structed to locate a simple figure contained 
in a more complex, colored pattern. The au- 
thors also present a number of significant 
relationships between scores based on the ori- 
entation tasks and on the embedded-figures 
test and scores derived from a variety of clini- 
cal assessment devices. 

Because the personality characteristics of 
visual field-independent and field-dependent 
groups are described in evaluative terms (the 
field-independent individual is described as 
more active, self-aware, self-assured, and generally 
“mature”), it was deemed advisable to 
appraise the effects of intelligence in a study 
of relationships among perceptual and person- 
ality variables (2). 

The Witkin embedded-figures test (4) was 
administered to 43 undergraduate students, 
18 men and 25 women, on whom American 
Council on Education Psychological Exami- 
nation (ACE) scores were also available. The 
product-moment correlation between the two 
measures was -.53 (p < .001) for the total 
group. Thus, Ss requiring longer times to extract 
embedded figures tended to have lower 
ACE scores. For males the correlation was 
-.57, and for females, -.48. These correlations 
may have been attenuated by the restricted 
range of intelligence in the college 
population. 


1 An extended report of this study may be obtained 
without charge from Douglas N. Jackson, Department 
of Psychology, Pennsylvania State Univer., 
University Park, Pa., or for a fee from the American 
Documentation Institute. Order Document No. 5361, 
remitting $1.25 for microfilm or $1.25 for photocopies. 
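
The significance level reported for the total-group correlation can be checked with the standard t transformation of a product-moment correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below is an illustration only; the function name is hypothetical, and the report does not state which test was actually used.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a product-moment correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Values reported in the text: r = -.53 for the total group of 43 Ss.
t_total = t_from_r(-0.53, 43)
print(round(t_total, 2), "with", 43 - 2, "degrees of freedom")
```

With 41 degrees of freedom the two-tailed critical value of t at the .001 level is roughly 3.5, so a statistic of this size is consistent with the reported p < .001.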

This relationship between intelligence and 
perception in young adults parallels prelimi- 
nary data obtained with children in Witkin’s 
Laboratory (5, p. 478). The present results 
are not surprising, considering Thurstone’s 
(3) findings, and should add a note of cau- 
tion to attempts to interpret relationships be- 
tween individual differences in perception and 
personality without adequate controls for in- 
tellectual differences. 

However, in a broader sense, the historical 
priority of the concept of intelligence need not 
imply its conceptual preeminence; it may be 
fruitful to consider intellectual abilities and 
perceptual mode as manifestations of more 
general dimensions of personality style. 
Clearly, there is a need for systematic study 
of the various intellectual abilities (1) in re- 
lation to mode of perception. 


Brief Report. 
Received July 15, 1957. 
An Evaluation of Eclectically Oriented Psychotherapy 


Frederick C. Thorne 


Brandon, Vermont 


The purpose of this study was to evaluate 
the results of eclectically oriented psychother- 
apy utilizing a wide variety of methods (3) 
specifically according to indications of time 
and place. The most rigorous test of the effi- 
cacy of any therapy can be made only by 
using cases of established severity which have 
proved refractory to other treatment methods, 
and in this study the most chronic and malig- 
nant cases available were deliberately se- 
lected. Much previous research has been 
weakened by failure to establish the exact 
nature and severity of the pathological proc- 
esses presumed to be under treatment. This 
defect was controlled in the present study by 
use of severely maladjusted subjects inde- 
pendently studied and diagnosed before re- 
ferral for therapy, so that the nature and de- 
gree of pathogenicity was objectively estab- 
lished by other specialists. Another common 
research defect is the failure to control for 
spontaneous remissions and the undifferen- 
tiated effects of miscellaneous therapeutic 
factors operating in all treatment situations. 
Such factors have been partially controlled in 
this study by selecting chronic cases which 
have shown no improvement or even become 
worse during prior institutional or outpatient 
treatment so that their refractoriness to ther- 
apy had been demonstrated. Finally, mental 
status at the start and end of therapy was 
rated objectively using a Prognostic Index 
(PI) score (1) measuring the five factors of 
malignancy of symptoms, trend of disorder, 
chronicity, degree of social and economic in- 
capacitation, and subjective feelings or status. 


Description of Cases 


The following criteria were used for selecting 
cases: (a) Each case was diagnosed as to 
nature and degree of disorder by an independent 
expert, usually the referring specialist. (b) 
Only cases having an initial PI score above 15 
were selected, thus establishing the fact of 
markedly incapacitating disorder. (c) All 
cases showed definite refractoriness to therapy 
as evidenced by chronicity even though hav- 
ing been exposed to prior therapies ranging 
from hospitalization to psychoanalysis for 
long periods. (d) Only cases motivated to ac- 
cept therapy and cooperative enough to con- 
tinue for at least 10 one-hour interviews and 
judged by the therapist as having sufficient 
personality resources to offer some hope for 
reorganization of personality integration in 
depth were used. 

The Ss consisted of the first 50 cases from 
the files in the years 1953-1956 which met 
these selection criteria. There were 20 males 
and 30 females, mean age 32.86 years (range, 
16 to 63 years), with average education of 
13.08 years (range, 8 to 20 years). Twenty- 
nine were treated while guests at Spring Lake 
Ranch, Cuttingsville, Vt., a psychiatric half- 
way house accepting referrals only from psy- 
chiatrists. Most of them either had been in 
psychiatric sanitaria or were on the borderline 
of institutionalization. All were ambulatory 
cases with behavior disorders or eccentricities 
so severe that they could not remain at home 
or in the community; many were considered 
unimproved even after years of treatment 
elsewhere. Twenty-one cases were referred by 
psychiatrists or physicians; all previously re- 
ceived treatment ranging from psychotherapy 
to tranquilizing drugs. Twenty-five Ss had 
been in psychiatric hospitals for periods rang- 
ing from one to 32 months (mean 12.5 
months), and 9 more had outpatient therapy 
lasting from six months to 21 months (mean 
15.6 months). Methods of treatment had included 
psychotherapy, psychoanalysis, electroshock, 
insulin coma, and chemotherapy. 
Table 1 presents the pretherapy psychiatric 
diagnoses made by referring specialists. Many 
cases presented mixed symptomatology, and 
several had been given different diagnoses re- 
flecting varying status. The final posttherapy 
diagnosis differed from the initial diagnosis in 
7 cases, in 5 of which there was a differential 
diagnostic question whether alcoholism with 
psychopathic character disorders was primary 
or secondary to complicating psychoneurotic 
reactions or prepsychotic state. During psy- 
chotherapy, the diagnostic process was ampli- 
fied constantly by the addition of dynamic 
insights upon which the moment-to-moment 
selection of treatment methods was based. 


Rating Scales 


The Prognostic Index (PI) was used to 
rate the mental status of each S at the start 
and end of therapy, using 5 five-point scales 
evaluating degree of malignancy from minimal 
to severe from easily obtained and verifiable 
facts. Psychiatric diagnosis concerning the 
malignancy of symptoms ranging from simple 
behavior disorder to severe psychotic reactions 
was used to establish degree of pathogenicity 
on Scale 1. The rating on Scale 2 on 
the trend of the disorder gave evidence concerning 
improving, unchanging, or progressive 
status. Chronicity was rated objectively in 
Scale 3 from the case history. Social and economic 
incapacitation as defined in the Veterans 
Administration psychiatric examination 
standards was rated in Scale 4. Subjective 
status was determined by the Ss’ reports concerning 
how they felt, usually with confirmatory 
evidence from observations and staff reports, 
on Scale 5. 


Table 1 

Diagnostic Classification of Case Materials by 
Independent Experts 

Category                                N 

Schizophrenic reactions 
Paranoid type 
Catatonic type 
Manic-depressive reactions 
Involutional states 
Chronic brain syndromes 
With convulsive disorder 
Postencephalitic 
Psychoneurotic reactions 
Immaturity-dependency 
Anxiety hysteria 
Anxiety tension states 
Depressive reactions 
Character disorder with alcoholism 
Alcoholism with other addictions 
Psychosomatic disorders 
Mixed types, prepsychotic 
Homosexual with alcoholism 


Table 2 

Sources of Rating Data for Prognostic Index Scores 

Rating Scales            Pretherapy               Posttherapy 

Malignancy of symptoms   Referring specialist,    Therapist, staff, 
                         usually psychiatrist     consultants 
Trend of disorder        Case record and          Therapist, staff 
                         other informants 
Chronicity (Pre) and     Case record and          Therapist 
Prognosis (Post)         other informants 
Incapacitation           Case record and          Staff, friends, 
                         other informants         relatives, employers 
Subjective status        Subjects                 Subjects 

The sources used to gather data for PI 
scores are shown in Table 2. The PI rating 
dimensions on the five scales are intended to 
sample all types of evidence, including refer- 
ring specialist, informants as to case history, 
therapist, staff, friends and relatives, and the 
Ss themselves. Given adequate case histories 
and staff reports which routinely covered the 
desired topics, PI ratings can be transcribed 
from case records by clerical assistants with 
final checking by the responsible therapist. 
Gradations on each scale rated from 1 to 5 
reflect increasing degrees of severity, with a 
summated score of 25 representing the most 
severe and malignant disorder, rapidly pro- 
gressive, of long chronicity, totally incapaci- 
tating, and subjectively unbearable. A PI 
total score of 15 was arbitrarily selected as 
the cutting point below which cases were not 
accepted for this study, thus insuring that all 
cases were rated as at least moderately severe 
and more than 50% incapacitating. Scores be- 
tween 6 and 10 indicate minimal severity, per- 
haps annoying but not incapacitating, and a 
score of 5 (the lowest possible rating if all 
scales are scored) indicates essential normal- 
ity within the ordinary wear-and-tear or de- 
terioration of life. 

The range of PI scores of the 50 cases at 
the start of therapy was from 16 to 24 (mean 
20.8), indicating moderately severe to inca- 
pacitating disorder. Clinically, if these cases 
had been any worse, they could not have been 
treated at Spring Lake Ranch or on an out- 
patient basis. The mental status of each case 
was recorded in an index of five numbers, 
such as 54254, in which each figure succes- 
sively refers to the degree of severity on 
Scales 1-5, respectively. This index not only 
yields a total score but permits progressive 
comparisons of the five dimensions from rat- 
ings on different dates. 
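
The five-figure index lends itself to a simple worked example. The sketch below parses an index such as 54254, sums it to the total PI score, and compares two ratings scale by scale. The function names and band labels are illustrative assumptions; the band boundaries follow the score ranges given in the text (5, 6 to 10, 11 to 15, and above 15).

```python
# Scale order follows the text; at the end of therapy the third scale
# rates prognosis rather than past chronicity.
SCALES = (
    "malignancy", "trend", "chronicity_or_prognosis",
    "incapacitation", "subjective_status",
)


def parse_pi(index):
    """Split a five-figure index such as '54254' into per-scale ratings."""
    text = str(index)
    if len(text) != 5 or any(ch not in "12345" for ch in text):
        raise ValueError(f"expected five ratings from 1 to 5, got {index!r}")
    return dict(zip(SCALES, (int(ch) for ch in text)))


def pi_total(index):
    """Summated PI score, from 5 (least severe) to 25 (most severe)."""
    return sum(parse_pi(index).values())


def severity_band(total):
    """Descriptive band for a total score; labels paraphrase the text."""
    if not 5 <= total <= 25:
        raise ValueError("total PI score must lie between 5 and 25")
    if total == 5:
        return "essential normality"
    if total <= 10:
        return "minimal severity"
    if total <= 15:
        return "moderate disability"
    return "moderately severe to incapacitating"


def pi_change(start, end):
    """Reduction on each scale between two ratings (positive = improved)."""
    s, e = parse_pi(start), parse_pi(end)
    return {scale: s[scale] - e[scale] for scale in SCALES}


print(pi_total("54254"), severity_band(pi_total("54254")))
# Start and end indices reported for Case 1.
print(pi_change("55454", "21311"))
```

Keeping the five figures separate, rather than storing only the total, is what allows the scale-by-scale comparison across rating dates.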

All cases were rated again at the end of 
therapy with a modified PI in which the 
chronicity scale was replaced by an estimate 
of prognosis with ratings by the therapist of 
whether improved status would be main- 
tained. The PI ratings at the end of therapy, 
then, were made by the same method and 
formulas for reporting except that future 
chronicity was predicted instead of past 
chronicity on the third scale. The range of PI 
scores at the end of therapy was from 5 to 
23 (mean 11.13). 


Results 


Table 3 shows the distribution of PI scores 
at the start and end of therapy. Of the 50 
cases, all of whom were socially and economi- 
cally incapacitated at the start of therapy, 5 
were unchanged or worse in that both PI 
scores were above 16. Nineteen cases finished 
with scores between 11 and 15, still showing 
moderate disability with disturbing but bear- 
able complaints and marginal eccentricities; 
these Ss were considered as partially rehabili- 
tated with hope for some productivity under 
further supervision and therapy. Twenty-three 
cases scored from 6 to 10 on their final rating, 


with only minimal residuals and were con- 
sidered functionally normal. Three cases had 
final scores of 5 and considered themselves 
fully cured. 
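
The outcome counts in this paragraph can be turned into percentages of the 50 cases with a short calculation. The dictionary keys below are paraphrased labels, not terms from a formal scoring scheme.

```python
# Case counts as stated in the text for the four outcome groups.
outcome_counts = {
    "unchanged or worse": 5,
    "partially rehabilitated (final score 11-15)": 19,
    "functionally normal (final score 6-10)": 23,
    "considered themselves fully cured (final score 5)": 3,
}

total_cases = sum(outcome_counts.values())
outcome_percent = {
    label: 100 * count / total_cases
    for label, count in outcome_counts.items()
}

for label, pct in outcome_percent.items():
    print(f"{label}: {pct:.0f}%")
```

The four groups account for all 50 cases.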

A further check on the validity of terminal 
PI ratings was made in terms of a follow-up 
study of actual status as of February 1, 1957. 
Thirty-two cases are gainfully employed or, 
if women, returned to family life with ade- 
quate adjustment. Nine cases are not work- 
ing; 4 are in institutions, 1 about to leave on 
trial; and 1 each, whereabouts unknown, 
eloped, or dead. Also, as of this date, 33 con- 
sider themselves in need of no further inten- 
sive therapy, 11 continue to seek interviews 
intermittently, and 6 have withdrawn from 
therapy against advice. 


Dynamics of Therapeutic Results 


All of the diagnostic and therapeutic meth- 
ods outlined elsewhere (2, 3) were utilized at 
some point in the handling of some of these 
cases. Depth modification of personality or- 
ganization was always the primary therapeu- 
tic objective while at the same time giving 
detailed attention to secondary objectives 
such as symptom relief. This approach is so 
complex that existing methods are incapable 
of objectifying and quantifying what takes 
place. Unfortunately, resources were not avail- 
able for pre- and posttherapy psychometric, 
sociometric, and projective analysis of enough 
cases to warrant detailed statistical treatment. 

The average PI rating for malignancy of 
symptoms on Scale 1 at the start of therapy 
was 4.34, and at the end, 2.40, on a five-point 
scale of severity. The mean PI rating of subjective 
status at the start was 4.62, and at the 
end, 2.20. These findings indicate that both 
objectively and subjectively, these Ss had relatively 
severe and painful symptoms at the 
start, and showed alleviation of both disorder 
and symptoms after therapy. Some evidence 
of the degree of functional disintegration pretherapy 
is given by the mean PI social and 
economic incapacitation rating of 4.46, while 
after therapy the incapacitation rating had 
improved to 1.72. 


Table 3 

Distribution of Prognostic Index Scores at Start 
and End of Therapy (N = 50) 

Score Intervals    Pretherapy    Posttherapy 

24-25                   4 
22-23                  14             3 
20-21                  20 
18-19                  10 
16-17                   2             2 
14-15                                 4 
12-13                                12 
10-11                                 7 
8-9                                  14 
6-7                                   5 
5                                     3 


Table 4 

Therapist’s Evaluation of Success of Utilization of Eclectic Methods 

                                                     Evaluation 
Type of Therapy Used                      Cases   Good   Fair   Poor 

Emotional reconditioning                    50     32     11      7 
  Expressing feelings, releasing repressions, catharsis, desensitization, 
  accepting ambivalence, tolerating frustration, resolving conflict, 
  learning emotional controls 
Habit training and symptom control          29     17      8      4 
  Information giving, supportive and palliative therapy, environmental 
  manipulations, detailed tutoring, task therapy 
Social retraining                           19      6      9      4 
  Relationship therapy, situational adjustment, adapting status 
  requirements, role playing 
Rational therapy                            23     11      5      7 
  Intellectual and attitudinal reorientation, modification of value 
  systems, and philosophy of life 
Depth analysis                              22     13      4      5 
  Psychoanalytic study and interpretation of unconscious needs and 
  conflicts 
Self concept and life style                 22     12      5      5 
  Helping the person to perceive himself more realistically and to 
  adopt a more effective offensive-defensive life strategy 

The therapist also made evaluative ratings 
concerning the therapeutic methods used in 
each case according to whether the results 
were considered good, fair, or poor, as shown 
in Table 4. Emotional factors were dealt with 
routinely in each case first, usually beginning 
nondirectively, and later more actively to un- 
cover repressions, resolve conflict, and inter- 
pret anxiety, ambivalence, and frustration- 
hostility-aggression reactions. Reassurance, 
palliation, support, and situational adjust- 


ments were attempted wherever indicated to 
make the S more comfortable and to facilitate 
deeper therapy. Re-educative methods de- 
signed to foster insight and to teach more 
effective controls were utilized at all stages 
and included information-giving, detailed tu- 
toring, assignment of tasks to facilitate a more 
active coping with life problems, and the re- 
jection of neurotic reactions wherever identi- 
fied. Social retraining involved relationship 
therapy, analysis of transferences, study of 
status requirements, role-playing, and how to 
handle situations. Rational therapy included 
intellectual reorientation attempting to induce 
more valid ideas, values, and attitudes within 
the framework of the philosophy of science. 
Depth analysis stimulated the understanding 
of unconscious motivation, unblocking and accepting 
repressed impulses, and the interpretation 
of symptoms. The systematic rehabilitation 
of self-concepts and life styles was 
attempted where indicated. As expected, all 
methods had successes and failures. 
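
As a rough aid to reading Table 4, the therapist's ratings can be expressed as the share of cases rated good for each type of therapy. The sketch below uses the counts printed in the table; the derived proportions are simple arithmetic on those counts and are not figures reported by the author.

```python
# (good, fair, poor) counts per type of therapy, as printed in Table 4.
table4 = {
    "Emotional reconditioning": (32, 11, 7),
    "Habit training and symptom control": (17, 8, 4),
    "Social retraining": (6, 9, 4),
    "Rational therapy": (11, 5, 7),
    "Depth analysis": (13, 4, 5),
    "Self concept and life style": (12, 5, 5),
}


def good_share(counts):
    """Proportion of cases rated good among all cases given that therapy."""
    good, fair, poor = counts
    return good / (good + fair + poor)


for therapy, counts in table4.items():
    print(f"{therapy}: {sum(counts)} cases, {good_share(counts):.0%} rated good")
```

The row sums reproduce the Cases column of the table (50, 29, 19, 23, 22, 22), which serves as a check on the transcription.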
Case Reports 


The actual nature of the cases is perhaps 
best sampled by reviewing three condensed 
histories. It should be emphasized that it 
usually required many hours of going over 
the same problems to accomplish results. The 
average number of interviews for the whole 
group was 40.28, with a range from 9 inter- 
views in one case responding dramatically to 
brief psychotherapy as the etiologic factors 
were uncovered and dealt with quickly, to 
more than 132 interviews with the most 
difficult S (intensive treatment occurring in 
1953-1954) who had been seen intermittently 
since 1947. 


Case 1 


Male, aged 36, 12 years education, 132 
interviews 1953-1956. PI: start, 55454; end, 
21311. 


Case history. Adopted son; uneasy and competi- 
tive with father, a successful business man; felt re- 
jected by foster mother, a cool distant woman who 
seemed to prefer other adopted child. Always held 
to high standards; early developed expensive tastes 
which never able to support on his own. First acute 
anxiety on European trip (graduation present from 
high school) when started drinking to relieve symp- 
toms. Between ages 20-30, progressive maladjust- 
ment; never able earn living; progressive drinking 
associated with severe character disorder manifested 
by sexual irregularities, family irresponsibility, two 
divorces, dishonesty, frauds, perpetual insolvency, 
always bailed out by father. After spending over 
$100,000 on unsuccessful rehabilitation attempts including 
2 years psychoanalysis in prominent sanitarium, 
father developed manic-depressive psychosis 
and refused to see patient again during his life. 

Therapy. Initial symptomatic and supportive ther- 
apy attempted to reduce anxiety, paranoid defensive- 
ness, and acute alcoholic problems. Total push pro- 
gram instituted to deal with problem on all levels; 
therapist took over financial control, negotiated with 
father’s representative for adequate allowance and 
living quarters, budgeted debt reduction, arranged 
home purchase, and devised employment. Deep ther- 
apy attempted reorganization of self concepts and 
life style. Psychoanalytic interpretation of anxiety, 
frustration, aggression, paranoid attitudes, and neu- 
rotic mechanisms. After gaining insight and more 
realistic self concepts, assigned tasks of rejecting neu- 
rotic gains relentlessly. Accepted the most active di- 
rection cooperatively (reinforced by our control of 
money) but many false starts and backslidings. By 
1957, no intoxication for 2 years, economically sol- 
vent, socially at ease, doing responsible welfare work. 
Our most difficult case, a basic reorganization of an 
initially psychopathic personality achieved. 


Case 2 


Male, 19, 12 years education, 87 interviews 
1955-1956. PI: start, 44455; end, 32223. 


Case history. Came to America with refugee par- 
ents. It is stated that mother (a professional woman) 
had premonitions before his birth that he would be 
defective; psychiatrist later described her as the 
most rejecting mother ever observed. She claims he 
always made her nervous, she couldn’t work with 
him around, and to this date she won’t permit him 
to live in the same city. He became very anxious, 
insecure, hostile, aggressive, and destructive as a 
small child, developing severe temper tantrums and 
hysterical reactions which so interfered with parents’ 
health and work they kept him away from home 
after age 5 under psychiatric care continuously, for 
the last several years living in psychiatric hospital 
under analytic treatment. In 1955 this treatment was 
judged to have reached an impasse and he was sent 
away from hospital for a rest and not to receive 
therapy. He applied for treatment on his own. 

Therapy. Acutely anxious, frequent panic reactions 
at start. Uneasy, insecure, suspicious, hostile, unable 
to sit with girls or to relate to anyone except analyst, 
picking nose constantly, many somatic complaints, 
unable to work or relax at play, hateful towards par- 
ents. Resentful over being withdrawn from analysis 
and openly critical of us, expressed fears that he 
couldn’t be treated actively or directively. Allowed 
to ventilate anxieties and resentments nondirectively 
at start. Gradual interpretation of immature dependency 
reactions with insistence that neurotic gains be 
denied. Progressive conditioning of behavior by 
graded tasks involving increasing independence in the 
face of anxiety. Many regressions; expressed resent- 
ment against mother by killing chickens and punch- 
ing holes in wall. Taught that he must live even 
though maternal rejection continues. Self concept of 
himself as neurotic invalid challenged; interest 
aroused in his rational abilities and taught to handle 
anxieties actively. Now completing first year college 
away from home, shy but friendly towards girls, can 
now face mother on short visits even though she re- 
fuses to pay expenses if he insists on staying home. 


Case 3 


Female, 56, 13 years education, 76 interviews 
1953-1956. PI: start, 54555; end, 42211. 


Case history. Normal childhood. Encephalitis in 
2nd year college; residual Parkinsonism with general- 
ized rigidity, mask-like facies, organic memory and 
reasoning deficits, and recurrent oculogyric crises in 
which eyes roll up in head for minutes or hours inhibiting 
all mental functioning. Gradual economic-social 
inadaptability; unable to hold jobs or relate 
socially. Progressive emotional instability leading to 
psychotic episode in which found wandering nude. 
Referred as last resort before permanent institution- 
alization. 

Therapy. Constant state of emotional agitation at 
start, crying constantly, petulant, annoying others, 
and completely demoralized. Elsewhere considered 
unsuited for psychotherapy because of organic state. 
Treatment began nondirectively, repeated ventilation 
and catharsis necessary. Reassured by much personal 
attention, tutoring and assurance that she would not 
be sent away. Self concept altered to perceive herself 
more realistically, no longer trying to compete with 
former equals, but accepting new role of retiring but 
useful elder citizen. Former untenable goals dropped 
with new philosophy of acceptance emphasizing emo- 
tional calm and stability. After several months of 
psychotherapy, oculogyric crises formerly occurring 
several times weekly now ceased; Parkinsonian symp- 
toms improved, and mental functioning better. Now 
a healthy respected group member. Case cited to 
indicate how remaining personality resources may be 
maximized in organic conditions. 


Discussion 

Intensive psychotherapy involves the most 
difficult kind of work, both by therapist and 
client, to uncover and recondition deep patho- 
logical processes. In our experience, one of the 
commonest therapeutic errors is to abandon 
methods before they have had sufficient opportunity 
to achieve results. Table 4 suggests 
that all methods have their successes and fail- 
ures, and it is our practice to keep working 
and trying different approaches until some 
leverage is achieved. In our opinion, even such 
older, largely discredited, methods as persua- 
sion and suggestion may be effective if pur- 
sued diligently enough. In this series, clients 
routinely were given maximum opportunity to 
solve problems nondirectively, with more ac- 
tive directive methods being introduced only 
where the client was unable to progress alone. 
In most cases, the therapist functioned as a 
co-worker, helping the client gain insights, 
working patiently together to practice new 
modes of behavior, and consistently attempt- 
ing to operate in a friendly informal atmos- 
phere with most sessions having the quality 
of friendly conversations or bull sessions. Cli- 
ents were always told not to accept interpre- 
tations unless they fit, to criticize freely, and 
to expect that progress of therapy would not 
be steady at all times. Above all, they were 
taught to remain in the situation and face it, 
not giving up but to continue coping with be- 
havior even through painful periods when they 
were frankly told that they might become 
worse before new gains could be consolidated. 


Summary 


A group of 50 cases presenting severe be- 
havior disorders which had proven refractory 
to other therapies was selected by rigorous 
criteria for evaluation of the effects of inten- 
sive eclectically oriented psychotherapy. All 
cases were diagnosed by independent special- 
ists prior to therapy, manifested moderately 
severe to incapacitating disorder, had not im- 
proved sufficiently to become adjusted in the 
community under prior treatment ranging 
from hospitalization to psychoanalysis, and 
were cooperative enough and with sufficient 
resources to participate in deep psychother- 
apy. Mental status before and after therapy 
was objectified with Prognostic Index scores 
and other criteria of adjustment. Although all 
cases were socially and economically incapaci- 
tated at the start, 6% considered themselves 
as totally cured after therapy, 46% were rated 
as functionally cured with only minimal re- 
siduals, 38% showed marginal rehabilitation 
with some reduction of symptoms and in- 
capacitation, and 10% were unchanged or 
worse. It is concluded that eclectically ori- 
ented psychotherapy is capable of improving 
personality integration at both symptomatic 
and depth levels in selected severe cases. 
The Effect of Varying Degrees of Projection 
on Test Scores¹ 


Edward J. Wallon² and Wilse B. Webb 
U. S. Naval School of Aviation Medicine 


Disputes about the relative merits of objec- 
tive and projective personality measures have 
long been a part of the area of psychological 
assessment. The validity of objective person- 
ality tests has been repeatedly challenged (4), 
and the dissatisfaction and general lack of 
confidence of psychologists in these measures 
have been expressed frequently (9). Projec- 
tive devices, on the other hand, have often 
been subject to criticism because of their lack 
of standardization, norms, and scoring objec- 
tivity, together with the inevitable accompani- 
ment of low reliability. 

One direction that efforts to resolve these 
issues have taken is represented by various 
attempts to “objectify” projective techniques. 
These endeavors have been concerned, in the 
main, either with the development of objec- 
tive, quantitative scoring systems (1, 2, 3, 10, 
12) to be applied to the “free” responses ob- 
tained from the administration of the original 
form of the projective test, or with altering 
the form of the original projective test by 
equipping the test cards or items with some 
type of multiple-choice or “yes-no” response 
from which the subject must select one of the 
alternatives provided, instead of giving an inventive 
response in the usual manner (5, 6, 7, 
8). Both methods of objectification are di- 
rected toward fulfilling the need for what 
Thurstone (14) described as tests which are 
projective for the subject, objective for the 
scorer. The question arises, however, as to just 
how much or what, if anything, is “lost” or 
“altered” in the process of converting a projective 
technique into an objective equivalent. 
This paper is directed toward an examination 
of this question. 


1 Opinions or conclusions contained in this paper 
are those of the authors. They are not to be construed 
as necessarily reflecting the view or the endorsement 
of the Navy Department. A form of this 
paper was published as a research report of the 
Naval School of Aviation Medicine. 

2 Now at Purdue University. 

It was proposed that given projective tests 
be administered according to the usual meth- 
ods, that corresponding objective forms of 
these tests be devised and administered, and 
a comparison made between the scoring pat- 
terns obtained for each form. It was further 
proposed that an intermediate “projective- 
objective” test procedure be introduced in 
which the subjects (Ss) first would respond to 
the projective test in the normal manner fol- 
lowing which they would match their projec- 
tive productions with those of the multiple- 
choice alternatives constructed for the objec- 
tive form of the test. It was hypothesized that 
such a partial retention of the projective as- 
pect of the original test technique would in- 
crease the fidelity of the objective answers 
thereby obtained, permitting closer agreement 
with the responses commonly elicited by the 
projective form. 

The two projective tests selected for use in 
the investigation of these phenomena were the 
Rosenzweig Picture-Frustration Test and the 
Sentence Completion Test. 


Rosenzweig Picture-Frustration Test 


The Rosenzweig Picture-Frustration Test is 
a projective technique designed to measure re- 
actions to frustration. It consists of a series of 
24 cartoons depicting frustrating situations. In 
each situation, one of the persons is remark- 
ing on the frustrating occurrence, and a blank 
caption box is provided for the other person 
involved, into which the subject writes a reply 




to the comment made. These responses are 
then scored as to the direction of aggression 
and type of reaction. The directions of
aggression are: (a) Extrapunitive, in which
hostility is directed toward some person or
thing in the environment, (b) Intropunitive,
in which hostility is directed toward the self,
and (c) Impunitive, in which hostility is
minimized or denied. These categories are
designated by the symbols E, I, and M. The
types of reactions are Obstacle-Dominant,
Ego-Defensive, and Need-Persistive. The
present study, however, was concerned only
with investigating the direction of aggression.
The final score is determined by summing the
number of E, I, and M responses. Since each
item receives a credit of one point, the total 
final score equals 24. When responses combine 
both intropunitive and extrapunitive aspects 
of aggression or a partly impunitive and partly 
aggressive reaction, the score for the given 
item is split, each aspect receiving a credit of 
0.5, although the total score for the item still 
remains 1. Occasionally, responses are encoun- 
tered which are too brief to be scored or which 
are the result of an inappropriate interpreta- 
tion of the situation. The number of such un- 
scorable responses, however, is small, only two 
per cent of the present sample falling into this 
group. 
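
The scoring rule just described can be illustrated with a minimal sketch. The function name `tally_directions` and the string encoding of the item codes are assumptions made for illustration only; they do not come from the Rosenzweig scoring manual.

```python
def tally_directions(item_codes):
    """Sum E, I, and M credits over one 24-item protocol.

    item_codes holds one string per cartoon (a hypothetical encoding):
    "E", "I", or "M" for a single direction, two symbols such as "EI"
    for a split item, and "" for an unscorable response.
    """
    totals = {"E": 0.0, "I": 0.0, "M": 0.0}
    unscorable = 0
    for code in item_codes:
        if not code:
            unscorable += 1
            continue
        # One point per item: 1.0 for a single direction, 0.5 each when split.
        credit = 1.0 / len(code)
        for symbol in code:
            totals[symbol] += credit
    return totals, unscorable


# Invented protocol: 21 single-direction items, 2 split items, 1 unscorable.
codes = ["E"] * 10 + ["I"] * 5 + ["M"] * 6 + ["EI", "EM", ""]
totals, unscorable = tally_directions(codes)
print(totals, unscorable)
```

Because each scorable item contributes exactly one point, the three totals plus the number of unscorable items always equal the number of items administered.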

An objective form of the test was devised 
in which the S received, along with the
Rosenzweig cartoon booklets, a series of
multiple-choice responses to each situation.
Thus, instead of writing his “free” response
into the
caption box, the S was directed to select one 
of the three multiple-choice items for each 
situation. Each series of multiple-choice re- 
sponses contained one Extrapunitive, one In- 
tropunitive, and one Impunitive response se- 
lected from among typical scoring examples 
provided for each item in the Rosenzweig 
scoring manual (13). 

The projective form was given to 88 naval 
aviation cadets in their first week of preflight 
indoctrination. As soon as the Ss had com- 
pleted their projective protocols, they were 
presented with the multiple-choice objective 
form and instructed as follows: 


On this sheet of paper are listed 24 situations to 
which you have been asked to provide replies for 
one of the characters portrayed. Each statement
describing a particular situation is followed by three
possible replies. You are to compare the answers you
have given to each situation in the booklet with the 
suggested responses below and indicate which of the 
three alternatives your answer most closely agrees 
with or resembles in meaning. In those cases where 
your response does not resemble any of the multiple- 
choice responses provided, you are to mark the al- 
ternative listed which is least unlike your response. 


In essence, the Ss were asked to score their 
responses as Extrapunitive, Intropunitive, or 
Impunitive by comparing and matching them 
with examples of each category. Two scores 
were then available from this administration: 
projective scores and projective-objective 
scores. 

A second administration using the objective 
multiple-choice version only was given to a 
second group of 71 cadets in preflight indoc- 
trination. 

The third administration of the test was 
presented under essentially a set to fake. A 
third group of 70 cadets was given the mul- 
tiple-choice Rosenzweig and instructed to se- 
lect the one answer to each item which they 
thought was “best,” that is, “the one which 
would put you in the best light as far as 
others are concerned.” In this way a measure 
of the social acceptability of the items was 
obtained. 


Results 


The projective form of the test was scored 
by the usual procedure outlined in the Ro- 
senzweig scoring manual. Since types of re- 
action were not included in this study, only 
the E, I, and M scores were considered. The
mean percentage scores of these categories 
were compared with the Rosenzweig norms 
(13). There were no significant differences in 
any category. 

Table 1 presents the means obtained on 
each of the three categories for each adminis- 
tration of the test. 


Table 1
Mean E, I, and M Scores for All Test Forms

Forms                   Extrapunitive  Intropunitive  Impunitive

Projective                  10.65          6.31          6.51
Projective-Objective         6.36          6.00         11.57
Objective                    3.03          9.20         11.77
Best Answer                   .83          9.64         13.54


Table 2
Table of t Ratios Obtained Between the Various
Methods of Presentation

                          Projective-                 “Best”
                          Objective     Objective     answers

Extrapunitive:
  Projective               16.92*        17.72*       29.05*
  Projective-Objective                    6.82*       14.11*
  Objective                                            6.25*
Intropunitive:
  Projective                1.26          7.65*        9.85*
  Projective-Objective                    7.96*       10.17*
  Objective                                          [missing]
Impunitive:
  Projective               20.65*        14.06*       20.56*
  Projective-Objective                 [illegible]     4.52*
  Objective                                            4.28*

* Significant at the .001 level.


Table 2 indicates the significance of the 
differences among these means. 

Chi-square tests were performed among the 
E, I, and M scores obtained on the projective, 
projective-objective, and objective versions of 
the test on each of the 24 test items. Between 
the projective and projective-objective forms 
there were 13 significant item shifts away
from E; 8 were toward the most socially
acceptable response, 5 toward the next most
socially acceptable response. Of 21 significant
shifts away from E in emphasis between the
projective and objective administrations, 13 
were toward the most acceptable response, and 
8 were toward the next most acceptable re- 
sponse. 
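
An item-level comparison of this kind can be sketched as a chi-square computed on a forms-by-categories table of response counts. The counts below are invented for illustration, and the handling of split (0.5) scores and of the dependence between projective and projective-objective responses from the same Ss is not specified in the text, so the sketch is a generic Pearson chi-square rather than a reconstruction of the authors' computation.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    table is a list of rows of observed counts, here one row per test
    form and one column per category (E, I, M).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented counts for a single item: E, I, M choices on two forms.
item_counts = [
    [50, 20, 18],  # projective form (88 Ss)
    [20, 25, 26],  # objective form (71 Ss)
]
statistic, dof = chi_square(item_counts)
print(round(statistic, 2), dof)
```

With three categories and two forms the statistic has 2 degrees of freedom, so it would be referred to the chi-square distribution with 2 df to judge whether the item shifted significantly.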


Discussion 


The results demonstrate that, with few ex- 
ceptions, both individual totals and item scores 
underwent consistent and significant altera- 
tions as the test was administered under each 
of the three separate conditions. As the 
amount of projective potential for a particular 
test form decreased, scores were yielded which 
more closely resembled the pattern of scores 
appearing on the faked version of the test. 

The results obtained with the projective- 
objective technique indicate that such a 
method yields scores which are significantly 
nearer to the scores obtained on the original
projective test than do purely objective
transformations. On the projective-objective
administration, two of the three categories under
study in this experiment, Extrapunitiveness
and Intropunitiveness, showed mean scores
which were significantly closer to the projec- 
tive mean scores than did the objective test. 
Thus, the effect of requiring the individual to 
score his own responses (by matching them 
with examples of the scoring categories), once
he has already committed himself on the pro- 
jective form of the test, serves to limit the 
tendency to select as many acceptable re- 
sponses as he would ordinarily if presented 
with the objective form only. Such projective- 
objective procedures, if adequately developed, 
might prove useful in bridging the gap be- 
tween projective and objective test methodol- 
ogy by permitting at least a partial retention 
of the desired features of both. 


Sentence Completion Test 


The second test employed was a sentence 
completion test in use at the U. S. Naval 
School of Aviation Medicine. An objective 
form of this sentence completion test, devel- 
oped earlier at the School of Aviation Medi- 
cine (7), was available for comparison with 
this projective form. The multiple-choice al- 
ternatives used on the objective form had 
been selected by first having a group of naval 
air cadets complete the projective form, cate- 
gorizing their responses according to content, 
and then differentiating them on the basis of 
indicated levels of personal and social adjust- 
ment within the cadet population. For each 
stem, three responses were finally selected: 
one indicated a favorable adjustment, one in- 
dicated lack of adjustment, and a third was 
“neutral.” Several examples are presented 
below: 


1. Compared to others: 
a. I’m probably average. 
b. In my group I sometimes feel I may fall short 
in some ways. 
c. I am as apt to get my wings as anyone else here.
2. I am a person: 
a. Who intends to strive hard in whatever I under- 
take. 
b. Who is attracted by many things and finds
difficulty choosing among them.
c. Who believes that friendship and fun are the 
most essential parts of living. 




468 Edward J. Wallon and Wilse B. Webb 


The final form of the completed multiple- 
choice test (designated M-1) consisted of 47 
items of which only 32 were keyed for per- 
sonal social adequacy. The remaining 15 were 
buffer items relating to social attitudes and 
were not tabulated in the score. Every malad- 
justed response received a score of one point, 
each nonmaladjusted response a score of 0.
The possible range of scores was from 0 to 32. 

In the present study, a fourth choice was 
added to each set of multiple-choice items, a 
“None of these” category which gave the S the 
option of avoiding matching a response if he 
felt it was not comparable to any of the 
choices offered. 

In the first administration of the test, a 
group of 92 naval aviation cadets in their first
week of preflight indoctrination was given the
original projective form of the test and
instructed to “Finish each sentence in any way
you like as long as the completed sentences
express what you actually feel or do.”
Immediately following the completion of the
projective form, the multiple-choice form of the
sentence completion test was given to the cadets
with the following instructions: 


Below is the list of unfinished sentences which you 
have just completed. Each statement is followed by 
three possible ways of completing it. You are to 
compare your answers to each statement with the 
suggested answers below and indicate which of the 
three alternatives your answer most closely agrees 
with or resembles in meaning. If your answer in no 
way agrees with or resembles any of the suggested 
answers, indicate this by answering choice “D,” 
“None of these.” 


The second administration consisted of giv- 
ing a second group of 103 cadets the objective 
form of the test only. 

The third administration was given under 
a set to fake. The multiple-choice sentence 
completion test was given to a third group of 
70 cadets representative of the total sample 
under study. The group was asked to select 
the “worst” response for each item, that is, 
the one which “you think would make you 
appear mentally or emotionally disturbed or 
put you in the worst light as far as others are 
concerned.” In this manner, the order of ac- 
ceptability of the responses to the group could 
be determined. This group did not have the 
option of using the “None of these” category. 


For each of the 32 items, the multiple-choice 
alternative selected most frequently as the 
“worst” was used as the “maladjusted” re- 
sponse. A scoring key was devised, based on 
these cadet-selected “maladjusted” items. All 
three forms of the test were scored with this 
key. Each maladjusted response selected by 
the S received a score of “1”, each nonmalad- 
justed response received a score of “0”. Since 
there was no suitable quantitative method 
available for scoring the projective form of 
the M-1 sentence completion test, it was not 
possible to include this form in the analysis 
of results. 
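
The construction of the cadet-derived key and its application can be sketched as follows. The function names, the letter labels, and the toy data are hypothetical; how ties for “most frequent” were broken is not stated in the text, and this sketch simply takes the first alternative to reach the top count.

```python
from collections import Counter


def build_key(worst_choices_by_item):
    """For each item, key the alternative most often chosen as “worst”."""
    return [Counter(choices).most_common(1)[0][0]
            for choices in worst_choices_by_item]


def raw_score(answers, key):
    """One point for every answer that matches the keyed alternative."""
    return sum(1 for answer, keyed in zip(answers, key) if answer == keyed)


# Toy data: two items, three cadets in the "worst response" group.
worst_choices = [["a", "b", "b"], ["c", "c", "a"]]
key = build_key(worst_choices)
print(key)
print(raw_score(["b", "a"], key))
```

With the full test, `worst_choices_by_item` would hold 32 lists of 70 choices each, and the raw score would range from 0 to 32.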


Results 


It was found that on the projective-objec- 
tive form, 36.2 per cent of the total responses 
fell into the unscorable “None of these” cate- 
gory, while in the objective administration, 
only 13.0 per cent fell into that category. In 
view of this large disproportion in the use of 
scorable categories, a comparison could not be 
made between raw scores obtained. Instead, 
percentage scores were obtained for each indi- 
vidual, based on the number of maladjusted 
responses chosen relative to the number of 
scorable responses (other than “None of 
these”).
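
The percentage score described above can be written out as a short sketch. The label "D" for “None of these” follows the instructions quoted earlier; the function name and the decision to return `None` when a S gives no scorable responses are assumptions of this illustration.

```python
def percentage_score(answers, key, none_label="D"):
    """Maladjusted responses as a percentage of scorable responses.

    Answers equal to none_label ("None of these") are excluded from
    both the numerator and the denominator.
    """
    scorable = [(a, k) for a, k in zip(answers, key) if a != none_label]
    if not scorable:
        return None  # no scorable responses, so no percentage is defined
    maladjusted = sum(1 for a, k in scorable if a == k)
    return 100.0 * maladjusted / len(scorable)


# Toy example: four items, one answered "None of these".
print(percentage_score(["b", "D", "a", "c"], ["b", "b", "b", "c"]))
```

Scoring relative to scorable responses keeps Ss who used “None of these” heavily on the same scale as those who did not.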

The means of the scores obtained on the 
three administrations included in this analysis 
are presented in Table 3. 

The significance of the differences was 
tested by the Mann-Whitney “U” test. All 
three of the scores were significantly different 
from each other at the .001 level of signifi- 
cance. 
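
For readers unfamiliar with the Mann-Whitney procedure, the U statistic for two independent groups of percentage scores can be sketched as below. This is a generic textbook computation with midranks for ties, not a reconstruction of the authors' working; the function name and the sample values are invented.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 through j
        for k in range(i, j):
            ranks[k] = midrank
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Invented percentage scores for two small groups.
print(mann_whitney_u([25.0, 31.3, 40.6], [12.5, 18.8, 21.9, 25.0]))
```

The smaller of the two U values is the one ordinarily referred to tables, or to a normal approximation when the groups are as large as those in this study.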

Chi-square comparisons of the percentage 
of responses to each of the multiple-choice 
categories were performed between the pro- 
jective-objective and objective test versions 
for all 32 items. Fourteen of the items yielded
significant chi-square coefficients. Of these 14
items, seven consisted of significant shifts in
emphasis on maladjusted responses in the
projective-objective to less maladjusted
responses in the objective form; in six cases
there were no significant shifts between the
two versions on categories other than the
maladjusted one; and in one case there was a
shift from an adjusted response on the
projective-objective to a maladjusted one on the
objective.


Table 3
Mean Percentage Scores for Sentence Completion Test

  N    Administration          Mean percentage

 70    “Worst” Response             60.18
 92    Projective-Objective         25.35
103    Objective                    20.67


Discussion

More than one-third of the responses to a 
projective-objective form of a sentence com- 
pletion test were placed in a nonscorable cate- 
gory by persons taking the test. However, con- 
sidered on a percentage basis, the individuals 
selected a significantly higher proportion of 
maladjusted responses on the test form incor- 
porating a partial element of projection as 
compared to a purely objective adaptation. 
These results are viewed as substantiating the 
data of the Rosenzweig Picture-Frustration 
Test. In the present case, scores on two adap- 
tations of a projective test moved away from 
undesirable responses as the amount of pro- 
jective potential decreased for each version; 
conversely, in the previous experiment, scores
obtained on two adaptations of a projective
test moved toward desirable responses as the
amount of projective potential for each form
decreased.


Self-Scoring versus Scoring by Others 


It was felt that a further approximation of 
the Rosenzweig projective-objective scores 
with the original projective scores might be 
achieved by having a group of cadets, other 
than the ones who took the test, do the match- 
ings with the multiple-choice responses of the 
objective form. It was also desired to deter- 
mine whether persons having specialized train- 
ing in psychology, but without training in the 
scoring of the Rosenzweig Picture-Frustration 
Test, could achieve a still closer approximation 
to the original projective scores than either of 
the two cadet groups. 

A comparable investigation was undertaken 
with the Sentence Completion Test. Cadets 
other than the ones who originally took the 
test were asked to match the responses which
had been given on the projective form with
the multiple-choice version. Matchings by 
psychologically trained persons were not pos- 
sible with this test because of limitations of 
personnel and time available. 


Procedure 


The projective responses of the 88 cadets 
taking the Rosenzweig Picture-Frustration 
Test were given, along with the multiple- 
choice form of the test, to a group of 88 peers 
(designated as the “Other Cadet” group) who 
were instructed as follows: 


Below are listed the 24 situations shown in the test 
booklets. Each statement describing a particular situ- 
ation is followed by three possible replies. In each 
case you are to match the answers which have al- 
ready been given in the test booklets with the mul- 
tiple-choice answers below and select the one which 
the answer most closely agrees with or resembles in 
meaning. In those cases in which the given answer 
does not agree at all with any of the responses below, 
select that response which is least unlike the answer 
in the booklet. Indicate your selection of answers by 
filling in between the dotted lines under the letters A, 
B, and C on your IBM sheet. 


The same 88 projective forms of the test 
were divided among a group of six research 
psychologists of the Aviation Psychology Lab- 
oratory. The training of the group members 
ranged from the Ph.D. degree to the Bachelor 
of Arts degree level. None of the psychologists 
in the group had training in the scoring of 
the Rosenzweig test. The group was given the 
same instructions as the “Other Cadet” group. 

The projective responses to the sentence 
completion tests given the group of 92 cadets 
in the second experiment were given, along 
with the objective form of the test, to a group 
of 92 peers who were given essentially the 
same instructions as above for matching of 
the responses. 


Results 


The mean E, I, and M scores for both the
“Other Cadets” and for the psychologists are 
compared in Table 4 with the self scores 
(scores achieved from the matchings of pro- 
jective and objective responses as done by the 
original group taking the test), the objective, 
and best scores. 

Table 4
A Comparison of Mean Scores for Categories for All
Administrations and Methods of Scoring

Test forms and
scoring procedures      Extrapunitive  Intropunitive  Impunitive

Projective scores           10.65          6.31          6.51
Psychologists’ scores        8.52          3.84         11.45
Other Cadets’ scores         7.33          5.44         11.15
Self scores                  6.36          6.00         11.57
Objective scores             3.03          9.20         11.77
Best Answer scores            .83          9.64         13.54


The significance of the differences among
the means obtained for the categories when
the matchings were done by psychologists and
when they were performed by the two cadet
groups is shown in Table 5.

The Pearson r correlations between the 
groups for all three categories are shown in 
Table 6. 

On the sentence completion test, the mean 
percentage score for maladjusted responses 
was 29.13 for the “Other Cadet” group com- 
pared to 25.35 for the self scores. The signifi- 
cance of the differences between these two 
means was determined by the “sign” test (12). 
The value yielded was 2.20, significant just 
under the .02 level. 
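
A sign test on paired scores can be sketched as follows. Whether the reported value of 2.20 is a normal deviate of this kind is an assumption; the text does not say how the statistic was computed. The sketch uses the usual normal approximation with a continuity correction and drops tied pairs, and the function name and data are invented.

```python
import math


def sign_test_z(first, second):
    """Normal-approximation z for the sign test on paired scores."""
    plus = sum(1 for a, b in zip(first, second) if a > b)
    minus = sum(1 for a, b in zip(first, second) if a < b)
    n = plus + minus  # tied pairs carry no sign and are dropped
    if n == 0:
        return 0.0
    # (|x - n/2| - 0.5) / (sqrt(n) / 2), written in terms of plus - minus
    return max(0.0, abs(plus - minus) - 1) / math.sqrt(n)


# Invented pairs: "Other Cadet" score vs. self score for the same protocol.
other = [30.0, 28.1, 25.0, 33.3, 21.9, 40.0]
own = [25.0, 28.1, 21.9, 30.0, 25.0, 31.3]
print(round(sign_test_z(other, own), 2))
```

The test uses only the direction of each paired difference, which suits percentage scores whose denominators differ from one S to the next.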


Table 5
Table of t Ratios for E, I, and M Among Scorers
(N = 88)

                         Self        Other Cadet   Psychologists’
                         scores      scores        scores

Extrapunitive:
  Projective scores      16.92**     [illegible]   [illegible]
  Self scores                         3.37*         8.15**
  Other Cadet scores                                4.52**
Intropunitive:
  Projective scores       1.26       [illegible]   [illegible]
  Self scores                         2.00          8.89**
  Other Cadet scores                                7.02**
Impunitive:
  Projective scores      20.65**     16.93**       16.65**
  Self scores                         1.35           .44
  Other Cadet scores                                 .88

* Significant at .01.
** Significant at .001.


Discussion


The small but significant difference between 
the scores obtained from self-matchings and from
matchings by other cadets on the Rosenzweig Extrapunitive
category is in the direction of agreement with 
the original projective extrapunitive mean 
score and suggests that a more accurate scor- 
ing may be obtained by scoring by others. It 
is suggested that this increased accuracy is 
due largely to the fact that the other cadets, 
being less personally involved in the match- 
ing, are not as prone to be defensive in making 
their comparisons. Counteracting this,
however, is the fact that the individual who
matches his own response has an advantage
over the independent observer in that he has
a greater knowledge of the true meaning of
the statements. Such underlying nuances as
sarcasm and irony are frequently undetectable
to an outsider from the mere wording of the
statement.


Table 6
Pearson r Correlations Between Scorers for
E, I, and M Categories
(N = 88)

                         Self        Other Cadet   Psychologists’
                         scores      scores        scores

Extrapunitive:
  Projective scores       .71         .83           .83
  Self scores                         .68           .73
  Other Cadet scores                               [illegible]
Intropunitive:
  Projective scores       .23         .42           .47
  Self scores                         .23           .33
  Other Cadet scores                               [illegible]
Impunitive:
  Projective scores       .63         .56           .57
  Self scores                         .57           .72
  Other Cadet scores                               [illegible]

The significant improvement of the psy- 
chologist group over the mean E scores of the 
“Other Cadet” group may be the result either 
of increased understanding of personality dy- 
namics, general familiarity with personality 
inventories, or practice effects (each psychologist
matched the responses of approximately
15 Ss as compared to one per S by the “Other
Cadet” group). 

The M category shows considerable same- 
ness in mean scores, all three groups having 
almost identical results which very closely ap- 
proximate the scores obtained on the objective 
version of the test. All three groups placed a 
significantly larger number of responses into 
this category than actually occurred as indi- 
cated by the projective scores. The reason for 
this tendency is not clear, although it may 
represent a readiness on the part of all per- 
sons scoring the tests to view a response as 
Impunitive unless the indications of extra- 
punitiveness or intropunitiveness were strong 
or clear-cut. In other words, it might have 
been a catchall for responses not falling clearly 
into either of the other two categories. 

The table of correlations (Table 6) shows 
that there was considerable agreement among 
the scorers on the Extrapunitive category. 
Thus, the relative ranking of individuals, on 
this category at least, remained high even 
though absolute size of scores varied consid- 
erably among the scorers. Impunitive correla- 
tions showed moderate agreement and the 
Intropunitive showed the smallest relation- 
ship for the different scoring groups. 
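
The point that relative ranking can be preserved while absolute scores differ follows from the definition of the Pearson r, as the sketch below illustrates. The scores are invented: a second scorer who assigns every S four fewer E points than the first produces a lower mean but a correlation of 1.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented E scores: scorer B is uniformly 4 points below scorer A.
scorer_a = [12.0, 10.0, 8.0, 14.0]
scorer_b = [score - 4.0 for score in scorer_a]
print(sum(scorer_a) / 4, sum(scorer_b) / 4, round(pearson_r(scorer_a, scorer_b), 3))
```

A high r between scorers is therefore compatible with the significant mean differences reported in Table 5; the two tables answer different questions.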

The data for the sentence completion test 
likewise indicate that significant differences 
exist between the matchings of Ss themselves 
and those of other Ss in the direction of un- 
favorable scores, in this case “maladjustment” 
scores. Again a smaller tendency toward ego- 
defensiveness on the part of other Ss is sug- 
gested as a possible explanation of this dif- 
ference in scoring. 


Summary 


Numerous attempts have been made to de- 
vise objective versions of standard projective 
tests in order to overcome the limitations 
which each method suffers when used inde- 
pendently. An investigation was undertaken 
(a) to ascertain the extent and nature of the 
differences which such transformations incur, 
and (b) to test the hypothesis that a partial
transformation of a projective technique into 
an objective form would yield scores which 
resembled those of the original projective test
to a significantly greater degree than a fully
objective version. A projective test of
reactions to frustration and a sentence completion
test were used. 

The results obtained indicate that signifi- 
cant differences occur between projective and 
objective versions of the same test in which 
the objective forms yield scores which ap- 
proximate “socially acceptable” responses to 
a considerably greater extent than do the pro- 
jective versions. It was also found that the 
scores derived from a partial objectification 
of the projective technique resembled “socially 
unacceptable” responses (extrapunitiveness 
and maladjusted), as obtained on the projec- 
tive version of the test, considerably more 
than fully objective adaptations. Among the 
possible explanations for these differences in 
scores the following were suggested: (a) The 
“fakability” of a test decreases with the 
amount of projective potential characteristic 
for a test form, the more projective forms 
being the less fakable; (b) the wording of the 
multiple-choice responses on the objective ver- 
sions may have led to a consistent rejection of 
socially unacceptable responses; (c) the pre- 
disposition to cheat may be less on a projec- 
tive form than on an objective form because 
the projective form does not require an indi- 
vidual to select inaccurate, restricted self de- 
scriptions. 

The use of a semi-projective technique is 
suggested as offering a partial solution to the 
need for tests which combine the desired fea- 
tures of the projective techniques with the re- 
liability and ease of scoring of objective meth- 
ods. Increased accuracy of scoring is also in- 
dicated in having subjects other than the ones 
taking the projective form of the test match 
the responses with the multiple-choice form 
of the test. 


Received February 8, 1957.


Thematic Apperception, Rorschach Content, and
Ratings of Sexual Attractiveness of Women
as Measures of the Sex Drive¹


Seymour Epstein and Richard Smith 
University of Massachusetts 


The present study was undertaken to de- 
termine whether projective responses can dis- 
tinguish between different degrees of the sex 
drive when the latter is not artificially in- 
duced. In a study by Clark (4), it was dem- 
onstrated that sex responses on the TAT 
(Thematic Apperception Test) are influenced 
by the presentation of pictures of nude women 
before the experiment. However, although 
such a study has the advantage of a high 
degree of experimental control, it raises the 
question of whether a drive aroused by ex- 
ternal stimulation functions in the same man- 
ner as one produced predominantly by inter- 
nal stimulation. One consideration, in this re- 
spect, is that experiments utilizing external 
stimulation are apt to induce set effects that 
are confounded with drive state, although 
clever deception, as used by Clark, can reduce 
this possibility. A second consideration, not 
unrelated to the first, is that drives produced 
by external stimulation are more apt to be 
correctly labeled. Finally, a drive aroused by 
external stimulation is apt to be more acute 
than one produced by mounting internal stim- 
ulation. It is obviously the relatively enduring 
internally produced drive states that are of 
fundamental concern to the user of projective 
tests, who views transient states produced by 
recent experiences as sources of error. 

1 This research was supported in part by a grant
from the Research Council of the University of
Massachusetts which provided funds for the junior
author who served as research assistant to the senior
author.

One means of obtaining different degrees of
drive strength used in studies on hunger and
thirst (1, 3, 5, 7, 11, 12, 16, 18, 19) has been
to require abstinence for fixed periods. How- 
ever, in any study where abstinence is re- 
quired, cues as to the nature of the study are 
unavoidably presented. The importance of this 
consideration has been demonstrated in two 
studies where it was found that instructional 
set effects exerted more of an influence upon 
drive-related responses than did the drive state 
itself (5, 18). Consequently, the only sound ex- 
perimental evidence for a relationship between 
the hunger drive and food-related responses 
stems from studies where testing was done be- 
fore and after normal meal time and no clues 
were provided about the nature of the study 
(5, 6, 10, 13, 15, 16). Considering that, un- 
like hunger, there are no regularly prescribed 
times for sexual gratification, the best that can
be done to estimate an individual's sex drive
may be to obtain information about his sexual 
behavior. In the present study, two measures 
of drive strength were investigated, one based 
upon reported rate of sexual orgasm, the other 
upon last orgasm in relation to rate. 


Method 


The Ss were 59 male students enrolled in 
an introductory psychology course at the Uni- 
versity of Massachusetts. Testing was done in 
two separate sessions, one group of 29 receiv- 
ing the Group Rorschach Test before the 
TAT, the other group receiving the tests in 
reverse order. After testing was completed, 
a questionnaire on sexual behavior was filled 
out. Following the questionnaire, slides of sex- 
ually attractive women were presented, and S 
rated each for sex appeal. 




Thematic Apperception 


In obtaining thematic apperception stories, 
the group method described by Atkinson and 
McClelland (1) was followed, with the excep- 
tion that four minutes rather than five were 
allowed for writing. A total of eight pictures 
was obtained from the standard TAT (14), 
the Symonds Picture-Story Test (17), and 
from magazines. Transparencies were made 
and projected on a screen. The pictures were
the same as those used in a study on the hun- 
ger drive (6), as the plan originally called for 
a comparison of the hunger and sex drives 
holding everything constant but the type of 
drive.² For purposes of this study, it merely
need be indicated that none of the pictures 
was high in stimulus-relevance and that a de- 
scription of them is available elsewhere (6). 
Two weighted scores parallel to the ones used 
in the study on hunger were investigated. 
These were Need sex and Appealingness of
object. A third score, Sexual need of hero, 
which was more restricted than Need sex in 
that it considered only the hero’s need as di- 
rectly indicated, was dropped after it became 
apparent that it gave almost identical results 
to Need sex. Following is a description of the 
two scores investigated: 


Need sex. This score was adapted from the Mur- 
ray scoring system (14). It involves a global judg- 
ment of the need of the story-teller based upon the 
intensity of the hero’s need, importance of the need 
to the plot, frequency with which it is mentioned, 
and duration. Need sex is defined as the need “to seek 
and enjoy the company of the opposite sex. To have 
sexual relations. To fall in love, to get married” (14, 
p. 10). A basic weight of 1 was assigned to the 
slightest indication of romance (e.g., “They are man 
and wife”), of 2 when a direct reference to romance 
was made or when some secondary physical contact 
was indicated (e.g., “They love each other.” “He
would like to kiss her”), and of 3 when sexual inter- 
course was implied (e.g., “She has become pregnant, 
and they are wondering what to do”). The basic 
weight was then modified by taking into account 
centrality, frequency, and duration. 

Appealingness of sex object. The essence of this
score is the determination of how desirable the
particular sex object described is. Unlike Need sex, this
score includes both negative and positive weights. A
weight of -3 was assigned when the woman was
described as unappealing and rejected despite an
indication of need by the hero (e.g., “He tells her she
doesn’t appeal to him, that he is in love with
someone else”). A weight of -2 was assigned when the
woman was generally unappealing, but the sex need
was directed toward her (e.g., “She doesn’t appeal to
him at all, but he has no one else so he takes her
out”). A weight of -1 was assigned when there was
any slight indication that the sex object was not
appealing (e.g., “She is O.K. but nothing exciting”). A
weight of 0 was assigned when there was no way of
evaluating the sex object (e.g., “The man and woman
are eating dinner”). A weight of +1 was assigned
when the sex object could be assumed to be normally
desirable (e.g., “They are kissing”). A weight of +2
was assigned when the sex object was described as
definitely appealing (e.g., “He is looking at this
attractive woman”). A weight of +3 was assigned
when the sex partner was described as unusually
appealing (e.g., “He is thinking how much he would
like to go to bed with this magnificent creature”).

2 This plan had to be abandoned when it was
found in the hunger study that the degree to which
the pictures were relevant to the drive in question
was a critical variable. In order to make meaningful
comparisons, it would be necessary to construct
parallel sets of pictures in regard to drive-related cues.

In order to obtain the final scores on thematic ap- 
perception, both Es scored 10 records selected at ran- 
dom. Discrepancies were discussed and examples of 
scoring problems taken down as guides. These ten
records were omitted from the computation of reli- 
ability figures. Following this preliminary procedure, 
both Es independently scored the remainder of the 
records. Finally, discrepancies were resolved by dis- 
cussion, the more conservative score being selected 
when agreement could not be reached. 


Rorschach Content 


The Group Rorschach Test was adminis- 
tered by the standard procedure (8), with the 
exception that S was instructed to indicate the 
location of each response at the time he re- 
corded it, and no additional inquiry was con- 
ducted. Following is a description of the scores 
investigated with examples of each identified 
by reference to Beck’s locations (2). 


Sex imagery. Any reference to sex, sex activity, 
sex anatomy, or sex clothing. This score includes all
other sex scores (e.g., Card II, W—“A symbolic
representation of sex”).

Human sex imagery. Sex imagery which relates to 
humans rather than animals. It includes sex activity, 
sexual body parts, clothing, and abstract references 
to sex (e.g., Card I, D3—“bottom half of a nude”).

Animal sex imagery. Sex imagery which relates to 
animals (e.g., Card IX, D3—“looks like a pair of
moose making their mating calls”). 

Popular sex imagery. Any sex response produced 
more than once in the total sample (e.g., Card VII,
W—“two people about to kiss”). 

Sex object. A simple enumeration of a sex-related 
object or of a sexual body part (e.g., Card X, D6—
“a woman's bra”; Card III, Dd27—“a woman's
breast”).

Sex activity. Any reference to sex-related activity 
(e.g., Card VII, W—“two people trying to kiss each
other”; Card VII, D4—“two bears touching or rub- 
bing their rear ends”). 


Questionnaire on Sex Drive 


The use of a questionnaire as a means of de- 
termining sexual drive has the weakness that 
it is affected by poor recall and conscious and 
unconscious falsification. However, there is no 
other practical way of obtaining information 
about rate of orgasm and time since last 
orgasm than by asking a person. By using 
safeguards, such as appealing to S’s scientific 
integrity, assuring anonymity, and requiring 
relatively specific information, it was hoped 
that falsification could be reduced to a mini- 
mum. Although absolute accuracy for each in- 
dividual could not be expected, it could more 
reasonably be assumed that a group that re- 
ported a high rate of sexual outlet had a higher 
mean rate than a group that reported a low 
rate, i.e., that part of the variance was due to
reality considerations. Of some importance in 
respect to the validity of the questionnaire was 
the finding that the median reported rate of 
2.19 was very close to Kinsey’s figure of 2.12 
for males of the same age and education (9, 
p. 336). Of course, the final evaluation of any 
measure depends upon whether it can be reli- 
ably related to other measures in a meaning- 
ful way. It will be seen that the findings in 
the present study can be most easily under- 
stood in terms of the assumption that the 
questionnaire measured what it was presumed 
to measure, although alternate interpretations 
are no doubt possible. In this regard, an im- 
portant consideration is that S had no clues 
about the nature of the study until after the 
first two projective tests. 

In order to decrease defensiveness, the fol- 
lowing statement was read before presenting 
the questionnaire: 


Kinsey has found that there are considerable in- 
dividual differences in frequency of sexual orgasm, 
but the rate for any one person is relatively con- 
stant. Among college students, the average rate per 
week varies from zero to thirty. There is no reason 
to believe that any one rate is more “normal” than 
any other. 

Try to be as accurate as possible in answering the
following questions, as the results of the study
depend upon your information. Remember, there is no
way of identifying your paper, so you needn’t be
concerned about revealing personal information.

In the questions below, sexual orgasm refers to all 
kinds considered as a whole, regardless of whether it 
occurs from intercourse, masturbation, wet dreams, 
petting, etc. 


In order to insure anonymity, Ss were in- 
structed not to put their names on their pa- 
pers. The questionnaire contained three check- 
lists calling for (a) day of last orgasm, with 
a range from “today” to “more than 7 days 
ago,” (b) average rate of orgasm during the
past 2 months, with a range from 0 to 22 or 
more times per week, and (c) subjective rat- 
ing of sex drive at the moment, with a range 
from “no sexual desire at all” to “intense 
sexual desire.” 


Ratings of Pictures 


After completing the questionnaire, Ss were 
requested to rate on a 5-point scale the sex 
appeal of three women whose pictures were 
successively projected on the screen. The first 
was of an attractive young woman in a bath- 
ing suit; the second was of a woman in a 
night gown who was lying seductively on a 
bed; the third was of a model wearing a low 
cut dress. The pictures were presented after 
the questionnaire in order to control for the 
possibility that arousal from the pictures 
would influence the ratings of sex drive. The 
reverse possibility was not considered as 
likely, and if it did occur would at least not 
affect the relationship of sex drive to the TAT 
and Rorschach. 


Analysis of Data 


Two possible measures of sex drive were 
considered, rate of orgasm and time since last 
orgasm relative to rate. Subjects were divided 
into equal thirds according to rate. Within 
each rate, a division as close to the median as 
possible was made for time since last orgasm. 
Excess cases were randomly discarded,
resulting in six cells of eight cases each. The
lowest rate consisted of once or less per week, 
and was subdivided according to whether the 
last orgasm occurred less than four days ago; 
the middle rate consisted of two times per 
week, and was subdivided according to whether 
the last orgasm occurred less than three days
ago; the highest rate consisted of three or
more times per week, and was subdivided ac- 
cording to whether the last orgasm occurred 
less than two days ago. All analyses of vari- 
ance involved two degrees of freedom for rate, 
one for satiation, two for rate × satiation, and
42 for the error term. The data were inspected 
for skewness and for correlation of means and 
variances; in no case was the need for trans- 
formation indicated. 
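The degrees of freedom quoted above follow from the balanced layout of three rate levels by two satiation levels with eight cases per cell. The following is a minimal sketch of that arithmetic only; the function name `anova_df` and its dictionary keys are illustrative and do not come from the original report.

```python
def anova_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a balanced two-way design with interaction."""
    total_n = levels_a * levels_b * n_per_cell
    df_a = levels_a - 1                      # main effect of the first factor
    df_b = levels_b - 1                      # main effect of the second factor
    df_ab = df_a * df_b                      # interaction
    df_error = levels_a * levels_b * (n_per_cell - 1)  # within-cell error
    return {
        "rate": df_a,
        "satiation": df_b,
        "rate x satiation": df_ab,
        "error": df_error,
        "total": total_n - 1,
    }


# Three rate levels, two satiation levels, eight cases per cell.
df = anova_df(3, 2, 8)
print(df)
```

The four component values sum to the total of 47, which is one less than the 48 cases retained after the random discarding.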


Results 
Subjective Sex Ratings 


Scores consisted of ratings on the five-point 
check list of how sexually reactive S felt at 
the moment. The analysis of variance failed to 
reveal a relationship between either rate or 
satiation and self-ratings of sexuality, all F 
values being less than one. As might be ex- 
pected, the sex drive, in this respect, is differ- 
ent from the hunger drive, where high corre- 
lations between subjective and objective meas- 
ures have been reported (6). In regard to the 
remainder of the study, these findings indicate 
that any relationship found between objective
and projective measures cannot be accounted 
for by a common relationship with subjective 
sexuality. 


Thematic Apperception Test 


Murray Need Sex. The range of scores was 
from 0 to 7, with a mean of 3.04. A Pearson 
product-moment correlation of .89 (N = 49) 
was found for interscorer reliability. Analysis 
of variance revealed that rate was significant 
at the .001 level (F = 8.31), and that no 
other source of variance approached signifi- 
cance. The mean scores for the three rates in 
ascending order were 1.12, 3.31, and 4.06, in- 
dicating a positive relationship between Mur- 
ray need sex scores and rate of orgasm. 

Appealingness of Sex Object. The range of 
scores was from -5 to 7, with a mean of 2.35.
The interscorer reliability coefficient was .84. 
Analysis of variance failed to reveal any source 
that approached significance. 

In a recent study on thematic apperception 
as a measure of physiological drive (6), there 
was some indication that negative scores pre- 
dicted in the same direction as positive scores 
and, therefore, should not be algebraically
summed. Accordingly, a second analysis was
performed disregarding algebraic sign. In this 
analysis, rate was significant at the .01 level 
(F = 5.47). The means for the three rates, in
ascending order, were 2.56, 3.34, and 4.81,
indicating a positive relationship between Appealingness
of sex object scores and rate of orgasm.
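The difference between the two analyses can be illustrated with a small sketch. It assumes that a subject's score is the sum of the weights given to his stories; the weights below are invented for illustration and are not data from the study.

```python
def algebraic_score(weights):
    """Sum of weights with sign retained (first analysis)."""
    return sum(weights)


def unsigned_score(weights):
    """Sum of weights disregarding algebraic sign (second analysis)."""
    return sum(abs(w) for w in weights)


# Hypothetical weights for one subject's four stories.
example_weights = [-2, 0, 3, 1]

print(algebraic_score(example_weights))  # 2
print(unsigned_score(example_weights))   # 6
```

Under the signed sum, a strongly negative story cancels a strongly positive one; under the unsigned sum, both count as evidence of preoccupation with the sex object.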


Rorschach Content 


The Rorschach content scores could not be 
analyzed by analysis of variance as the inci- 
dence of scorable responses was too low. Con- 
sequently, separate chi-square analyses for 
rate and satiation were performed. For rate, 
a division as close to the median as possible 
resulted in a comparison of 35 Ss with a rate 
of two or less times per week and 24 with a 
rate of three or more times per week. For 
satiation, where Ss were divided as close to 
the median as possible for each rate, it was 
necessary to eliminate five Ss who fell exactly 
at a median point. 

The low incidence of occurrence in most 
categories precluded highly reliable discrimi- 
nation between groups. Following is the num- 
ber of Ss, among the total of 59, who pro- 
duced at least one response in a category: 
Sex imagery, 20; Human sex imagery, 14; 
Animal sex imagery, 6; Popular sex imagery, 
11; Sex object, 12; Sex activity, 10. 

Following are the percentages for all scores 
of the number of Ss, divided according to 
rate, who produced at least one response, the 
figure for low-rate being presented first: Sex 
imagery: 26% vs. 46%; Human sex imagery: 
11% vs. 42%; Animal sex imagery: 11% vs. 
8%; Popular sex imagery: 17% vs. 21%; Sex 
object: 14% vs. 29%; Sex activity: 11% vs. 
25%. Yates’ correction was used for all chi
squares. The only score which significantly 
differentiated the groups was Human sex im- 
agery (.02 level). 
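As a rough check on the Human sex imagery comparison, the chi square with Yates’ correction can be sketched as follows. The cell counts are an assumption: 4 of 35 low-rate Ss and 10 of 24 high-rate Ss are the whole numbers that reproduce the reported 11% and 42% and the total of 14, but the report does not give the counts directly.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi square for a 2 x 2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of chi square with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Assumed counts: rows are low-rate and high-rate groups,
# columns are "at least one response" and "no response".
low_yes, low_no = 4, 31
high_yes, high_no = 10, 14

chi2 = yates_chi_square(low_yes, low_no, high_yes, high_no)
p = p_value_df1(chi2)
print(round(chi2, 2), round(p, 3))  # 5.62 0.018
```

Under these assumed counts the result is consistent with the reported .02 level.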

The division according to satiation did not 
result in significant differences on any score. 


Picture Ratings 


Scores were obtained by summing the rat- 
ings of sex appeal given to the three pictures. 
The range of scores was 3 to 15, with a mean 
of 9.77. Analysis of variance indicated that 
rate was significant at the .001 level (F = 
8.89), and that no other source of variance
approached significance. The mean scores for
the three rates in ascending order were 9.13, 
9.81, and 10.69, indicating a positive rela- 
tionship between rating the women in the pic- 
tures as high in sexual appeal and rate of 
orgasm. 


Discussion 


The major finding in the present study was 
that in three separate measures, thematic ap- 
perception, Rorschach content, and ratings of 
the sexual appealingness of women whose pic- 
tures were shown, a direct relationship was 
found between sexual responses and reported 
rate of sexual orgasm. However, the Rorschach 
finding was somewhat questionable in view of 
the number of scores investigated. In this re- 
spect, it was evident that the Rorschach test 
cannot be a sensitive measure of individual 
differences in sex drive as it elicits too few 
relevant responses. If the present findings are 
verified, it would indicate that human-related 
sex responses on the Rorschach are more pre- 
dictive of total sexual outlet than animal-re- 
lated responses. 

The finding of a direct relationship between 
a measure of sex drive and sex responses to 
projective techniques is not in accord with a 
study reported by Clark (4). In Clark’s study, 
when testing was done under similar circum- 
stances to the present study, a drive-aroused 
group produced fewer sex responses than a 
control group; only when testing was done 
after a beer party was a positive relationship 
found. The most obvious difference between 
the two studies is the manner in which drive 
was determined. Clark induced the sex drive 
by showing Ss pictures of nude women and by 
using an alluring female examiner. In the
present study, two noninduced measures of drive
were investigated, rate of orgasm, and time 
since last orgasm relative to rate. Only the 
former was found to be related to projective 
responses. A possible explanation is that sex 
rate itself is determined by the degree to 
which the physical expression of the sex drive 
is acceptable to the individual, while Clark’s 
criterion of drive did not involve acceptance 
of the drive. In addition, as has already been 
indicated, a drive induced by external stimulation
is more apt to be associated with inhibitory
reactions than one that is predominantly
inwardly determined, since the externally induced
drive tends to be more acute and is more readily
labeled (6).

A difficulty inherent in most studies on drive 
strength is that the intensity of the drive must 
be inferred from antecedent conditions or con- 
current relationships which do not bear a di- 
rect relationship to the impulse. The present 
findings are particularly vulnerable on this 
point, and might most cautiously be
interpreted as simply having demonstrated a
relationship between projective responses and an
external criterion. 

In regard to further work on projective re- 
sponses as measures of drive strength, anchor- 
ing the drive in a physiological state rather 
than relying upon verbal report would be an 
obvious improvement. Such an approach is 
currently being explored with women Ss by 
relating sex responses to phase of menstrual 
cycle. Another promising line of investigation 
is suggested by recent work which offers an 
approach to evaluating inhibitory and drive 
reactions as well as their interaction (6). 


Summary 


Fifty-nine college males were given three 
projective tests, a test of thematic appercep- 
tion, the Group Rorschach Test, and pictures 
of attractive women who were rated for sex 
appeal. In order to measure sex drive, a ques- 
tionnaire was anonymously filled out with in- 
formation on average rate of sexual orgasm, 
number of days since last orgasm, and sexual 
responsivity at the moment. Two measures of 
drive were investigated, one based solely on 
rate; the other on satiation, as determined by 
days since last orgasm, relative to rate. The 
major findings may be summarized as follows: 


1. Subjective judgment of sexuality was not 
significantly related to rate or satiation. 

2. Sexual response scores on all three pro- 
jective measures were directly and significantly 
associated with rate, but none was related to 
satiation. 

3. Rorschach sex content cannot be a sensi- 
tive measure of drive as too few such responses 
are elicited. In view of the number of Ror- 
schach comparisons made, the results on this 
test require verification. 

4. On a thematic apperception score,
Appealingness of sex object, responses describing
the object as unappealing predicted in the
same direction as ones describing the object
as appealing.

Received January 29, 1957.


9. Kinsey, A. C., Pomeroy, W. B., & Martin, C. E.
Sexual behavior in the human male. Philadelphia:
Saunders, 1948.

10. Lazarus, R. S., Yousem, H., & Arenberg, A.
Hunger and perception. J. Pers., 1953, 21, 312-


Unsuccessful Differential Diagnosis
from the Rorschach¹


Stewart G. Armitage and David Pearl 
VA Hospital, Battle Creek, Michigan 


Numerous articles appearing in the litera- 
ture have attempted to demonstrate a rela- 
tionship between Rorschach findings and vari- 
ous psychiatric diagnoses. Investigators report 
varying success in relating test characteristics 
to specific diagnostic categories. Some of these 
characteristics are the presence or absence of 
certain of the Rorschach determinants, their 
relative strengths, patterns, ratios, and their 
adherence to acceptable criteria. Other meth- 
ods have relied more heavily upon the content 
of the record and its characteristics, while still 
others have employed both content and deter- 
minants in various combinations. 

The attempted validation of these findings 
has in turn given rise to a still greater number 
of articles, among which there is considerable 
disagreement. Some have indicated that a
relationship exists between certain Rorschach
patternings and psychiatric diagnoses,
while others report findings seemingly dia- 
metrically opposed. This confusion has been 
variously attributed to differing psychiatric 
populations, varying methods of Rorschach 
administration and scoring, and the questionable
use of the psychiatric diagnosis as a
criterion. These difficulties are recognized and an
attempt is being made in this study to avoid 
them. 

Many clinicians object to the use of the 
Rorschach primarily as a diagnostic tool. 
They point out that its most effective use lies 
in such areas as personality description, its 
prognostic value, and its indications for treat- 
ment possibilities. The diagnostic impression 
is considered to be a rather unimportant by- 
product. In numerous psychiatric hospitals the 


1 From the Veterans Administration Hospital, Bat- 
tle Creek, Michigan. 


diagnosis is still important, and despite these 
objections the Rorschach is frequently used in 
diagnostic determination. It becomes impor- 
tant then to specifically investigate the poten- 
tial of the Rorschach for such use, to ascertain 
those aspects of Rorschach utilization which 
contribute to successful diagnosis, and to de- 
termine whether or not these aspects vary 
with differing types of patients. 


Method 
Procedure 


All Rorschach records were obtained from 
patients referred to the Psychology Service for 
testing by members of the hospital admissions 
board. Patients who were too ill or whose clin- 
ical manifestations were sufficiently clear-cut 
so that a proper diagnosis could be made 
solely on that basis were not tested. For this 
reason, the sample does not represent a cross 
section of the hospital population. Rather, it 
refers directly to those types of problems 
which require diagnostic assistance. The psy- 
chologist’s diagnostic impression was derived 
from the Wechsler-Bellevue Intelligence Scale, 
Form I, the Rorschach test results, and
whatever cues he may have obtained from
patient-examiner interaction during the testing
sessions. All diagnoses based upon psychological
test data were made prior to the final diagnos- 
tic staffings of patients and agreed in 80% of 
instances with the final staff diagnoses. Rec- 
ords were selected from a large pool of ap- 
proximately 1,000 cases which were collected 
over a five-year period. For this study, only 
such cases were retained which agreed
completely with independent impressions
obtained by various psychiatrists, based on their
clinical observations and the history of
patients’ illnesses, and with the classification
made at the final diagnostic staff. All records 
utilized were rescored in accordance with 
Beck’s (1) current scoring standards and were 
considered sufficiently extensive in terms of 
number of responses and detail of inquiry to 
permit an adequate evaluation. 

Two approaches were made to the analysis 
of data. The first attempted to determine a 
relationship between Rorschach determinants 
and specific psychiatric classifications such as 
paranoid schizophrenia, unclassified schizo- 
phrenia, neurosis, and character disorder. The 
second was based upon the judgments of these 
same four diagnostic categories made by five 
staff psychologists with four to nine years’ di- 
agnostic experience with the Rorschach test. 
The records to be judged were randomly 
drawn from the larger sample employed in the 
first approach with the proviso that none of 
these were originally obtained by any of the 
judges. All identifying materials were re- 
moved. These records were then divided 
equally into two groups with 60 cases in each. 
In the first of these, the psychograms and pro- 
tocols were separated, and scoring and loca- 
tion designations were removed. Records to be 
judged were assembled in groups of ten so 
that each assembly included five records from 
each of two of the four diagnostic categories 
which were to be evaluated. In this manner 
all combinations of diagnostic classifications 
were obtained. Each judge made 180 judg- 
ments, 60 under each judgmental situation. 
Judges were aware that the records were 
equally divided among the four diagnostic 
groups. They were, however, warned that each 
of the groups of 10 records would not be di- 
vided equally among the four diagnostic cate- 
gories. Each judge received only one record 
group per day, since it was believed that this 
procedure would relieve boredom and mini- 
mize any attempt to distribute judgments 
equally among the four diagnostic categories. 
Judges were instructed to place each of the 
psychograms, protocols, or combination of the 
two in one of the four diagnostic classifica- 
tions, employing any method desired and 
using any cues available from the presented 
materials. They were requested to keep no
record of previous placements since this might
influence their judgmental processes. 
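The assembly of records into groups of ten can be sketched as below. This is one reading of the procedure, not a statement of the authors' exact method: it assumes one block for each pairing of the four categories within a 60-case half of the sample, with five records per category in each block. The category labels and variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

categories = [
    "paranoid schizophrenia",
    "unclassified schizophrenia",
    "neurosis",
    "character disorder",
]
RECORDS_PER_CATEGORY_IN_BLOCK = 5  # five records from each of two categories

# One block of ten records for every pairing of two categories.
blocks = [
    {first: RECORDS_PER_CATEGORY_IN_BLOCK, second: RECORDS_PER_CATEGORY_IN_BLOCK}
    for first, second in combinations(categories, 2)
]

records_per_half = sum(sum(block.values()) for block in blocks)

per_category = Counter()
for block in blocks:
    per_category.update(block)

print(len(blocks), records_per_half, dict(per_category))
```

On this reading there are six blocks totaling 60 records, with 15 records from each category, which agrees with 60 cases per half and 30 cases per category in the full judgmental sample.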


Subjects 


The sample was drawn from World War II 
patients ranging in age from 20 to 45 years. 
Brain damaged or epileptic patients were ex- 
cluded as were those who received electro- 
shock, insulin coma, or chemotherapy treat- 
ment prior to their psychological test evalua- 
tions. A total of 809 records was used in the 
first approach which investigated the rela- 
tionship between Rorschach determinants 
and psychiatric classification. This sample 
included 140 paranoid schizophrenics, 341 
unclassified schizophrenics, 243 neurotics, and 
85 character disorders. These Ss were reason- 
ably well matched for education, age, and IQ. 
The following are means and standard devia- 
tions for each group: Paranoid Schizophre- 
nics: 10.37 years of school (SD 2.68), 30.47 
years of age (SD 5.34), and 105.11 IQ (SD 
14.19). Unclassified Schizophrenics: 9.86 
years of school (SD 2.43), 28.97 years of age 
(SD 6.20), and 101.77 IQ (SD 14.99). Neu- 
rotics: 10.24 years of school (SD 2.36), 30.68 
years of age (SD 6.38), and 108.71 IQ (SD 
12.29). Character Disorders: 9.97 years of 
school (SD 2.34), 29.25 years of age (SD 
6.42), and 106.58 IQ (SD 11.86). 

It should be noted here that, for the two
schizophrenic categories, the test diagnoses and
the classification arrived at in the final
hospital diagnostic staff agreed perfectly. In
the instance of the other two classifications, 
some intra-category variance was permitted; 
for example, testing for a specific patient may 
have suggested a diagnosis of character dis- 
order, passive-aggressive reaction, while the 
final staff diagnosis might have been character 
disorder, passive-dependent type. In this in- 
stance, however, there was agreement in the 
classification of character disorder. The same 
type of intra-category deviation was allowed 
within the neurotic category. 

For the judgmental approach, 120 cases 
were randomly drawn from the larger sample 
of 809 Ss. This sample was composed of 30 
cases from each of the four diagnostic cate- 
gories. These were closely matched for age 
and IQ and approximately for educational 
level and occupation.


Results
Objective Analysis 


For the objective analysis, 28 factors were 
analyzed. These were: Total number of re- 
sponses, W, poor W, D, Dd, H, Hd, H plus 
Hd, poor H plus Hd, sexual responses, P, S, 
V, blends, rejections, content categories, color 
responses, M, poor M, animal movement, in- 
animate movement, sum of human, animal 
and inanimate movement, Y, poor Y, F%, 
A%, An, Hd >H, and hostile responses 
(characterized by violence, explosion, blood, 
fire, and other destructive forces). Chi-square 
analyses were computed between the four psy- 
chiatric groups for each of the above factors. 
Significant differences between diagnostic 
groups were found in only nine of these vari- 
ables and are presented in Table 1. 

Attempts to further refine these factors 
failed to be of value as they occurred too in- 
frequently to be of diagnostic use. Frequency 
distribution curves of the preceding distribu- 
tions were plotted for each diagnostic cate- 
gory and frequency cutoff points were se- 
lected where such curves showed their great- 
est divergence. Various groupings of these 
were then applied to 100 randomly chosen 
cases of the original sample, equally distrib- 
uted among the four diagnostic categories, to 
determine whether the diagnostic structure of
this sample could be predicted. However,
only 31% of cases were accurately diagnosed
by this approach, ranging from 28% for char- 
acter disorders to 34% for unclassified schiz- 
ophrenics. Use of determinant percentages 
failed to produce results differing from those 
given above. 
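To put the 31% figure beside chance expectancy, a simple binomial tail can be computed. This is an illustration added here, not a test reported by the authors. It assumes that 31% of 100 cases means 31 correct placements and that chance with four equally represented categories is .25.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_chance):
    """Probability of at least `successes` hits in `trials` under chance."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed: 31 correct out of 100, chance rate of one in four.
tail = binomial_upper_tail(31, 100, 0.25)
print(round(tail, 3))
```

Under these assumptions the tail probability comes out near .10, so 31 correct placements would not differ reliably from the 25 expected by chance.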

Since it was possible that certain determinant
distributions on specific Rorschach plates
might be of diagnostic value, a card-by-card 
analysis was undertaken. Some of the previ- 
ously utilized Rorschach factors were omitted 
since they occurred too infrequently. Results 
indicated that F+%, P, An, Hd, H, and W
characteristics of specific Rorschach cards 
were significant in differentiating various di- 
agnostic groups. Again, when cutoff points 
utilizing patternings of these card determi- 
nant relationships were applied to 100 ran- 
domly drawn cases from the original sample, 
placement into diagnostic categories failed to 
be significantly better than chance. 


Judgmental Analysis 


Thirty records from each of the four diag- 
nostic groups, a total of 120 Rorschachs, were 
drawn from the original sample of 809 cases. 
As previously indicated, psychograms and 
protocols, separated in half of the sample, 
were judged independently. In the remaining 
half, the psychograms and protocols were 
judged together. 


Table 1 
Rorschach Factors Significantly Differentiating Between Diagnostic Groups 


Number 
Cc 

Diagnostic Group    W    Hd    Hd>H    Sex    P    Hostile Responses    F+%    An
Uncl. Schizophrenia vs. Par. Schizophrenia    NS*    p .03    p .05    NS    NS    p .01    NS    NS    NS
Uncl. Schizophrenia vs. Neurotic    NS    p .02    NS    p .03    p .01    NS    p .01    p .01    NS
Uncl. Schizophrenia vs. C. Disorder    p .02    NS    p .02    p .01    NS    NS    NS    p .001    NS
Par. Schizophrenia vs. Neurotic    p .01    p .02    NS    p .03    NS    p .01    p .03    NS    p .01
Par. Schizophrenia vs. C. Disorder    p .02    p .02    NS    p .01    NS    p .02    NS    NS    p .05
Neurotic vs. C. Disorder    NS    NS    NS    NS    NS    NS    NS    NS    NS
Over-all    p .01    p .01    p .01    p .01    p .01    p .01    p .02    p .01    p .02

* Not significant.


Table 2
Correct Diagnostic Classifications of Judges Under Three Conditions of Judgment

[Table body not recoverable. Columns: Paranoid Schizophrenia, Unclassified Schizophrenia, Neurotic, Character Disorder, and All Diagnoses, each subdivided into Protocols, Psychograms, and Combination. Rows: individual judges and All Judgments. The only surviving entry is 28 in the All Judgments row.]



For this aspect of the study, the following
are considered: (a) accuracy of judgments
for the four diagnostic categories and again
when combined into overall psychotic and
nonpsychotic groups (these are compared
under the conditions when judgments are
based upon psychogram or protocol alone or
a combination of the two); (b) judgmental
bias; and (c) judgmental reliability.

Accuracy of judgments. Data showing the 
accuracy of judgment under the various con- 
ditions for judgment are shown in Table 2 
and the distribution of all judgments is to be
found in Table 3. Simple and multivariate
analyses of variance failed to disclose any
significant deviations from chance expectancy for
the number of correct judgments made by 
judges for all Ss or for any of the diagnostic 
categories regardless of whether psychograms, 
protocols, or combinations of the two were 
used. Furthermore, no significant differences 
were found between judgments based on 
either psychograms, protocols, or both.
Although missing the criterion of significance,
some indications were present that the
psychograms were somewhat better for the
prediction of neurosis and that the protocols
permitted a somewhat more accurate
judgment of paranoid schizophrenia. When
judgments were considered for the prediction of
psychosis (paranoid and unclassified schizo- 
phrenics combined) or nonpsychosis, analysis 
of variance disclosed that these were signifi- 
cantly better than chance (p < .05). No sig- 
nificant differences were found between any 
of the three conditions of prediction, although 
more correct judgments were obtained when 
protocols and psychograms were utilized in 
combination. 
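The regrouping of the four categories into the psychotic and nonpsychotic dichotomy can be sketched as follows. The judgments below are invented for illustration and are not data from the study; the function names are likewise hypothetical.

```python
PSYCHOTIC = {"paranoid schizophrenia", "unclassified schizophrenia"}


def to_dichotomy(diagnosis):
    """Collapse a specific diagnosis into psychotic or nonpsychotic."""
    return "psychotic" if diagnosis in PSYCHOTIC else "nonpsychotic"


def accuracy(actual, judged, collapse=False):
    """Proportion of judged diagnoses matching the actual ones."""
    if collapse:
        actual = [to_dichotomy(d) for d in actual]
        judged = [to_dichotomy(d) for d in judged]
    hits = sum(a == j for a, j in zip(actual, judged))
    return hits / len(actual)


# Hypothetical records and judgments.
actual = ["paranoid schizophrenia", "unclassified schizophrenia",
          "neurosis", "character disorder"]
judged = ["unclassified schizophrenia", "unclassified schizophrenia",
          "character disorder", "paranoid schizophrenia"]

print(accuracy(actual, judged))                 # 0.25
print(accuracy(actual, judged, collapse=True))  # 0.75
```

A judge who confuses the two schizophrenic groups, or the two nonpsychotic groups, is wrong on the specific diagnosis but right on the dichotomy, and chance expectancy rises from one in four to one in two.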

A possible factor entering into the
differential judgmental situation is the similarity
or dissimilarity of the data to be judged at
any one time. If one assumes that Rorschachs 
within a diagnostic group or within a psycho- 
tic-nonpsychotic category have greater simi- 
larity among themselves than to Rorschachs 
of other diagnostic categories, then judgments 
of a series of similar cases should be more 
difficult than those of a series of dissimilar 
cases. This difficulty would be accentuated if 
the judges had the expectancy that represen- 
tatives of any or all categories might be pres- 


Table 3 
Relationship of Actual Diagnoses to Judged Diagnoses by Five Judges 


Judged Diagnoses 


Paranoid Unclassified 


Schizophrenia 


Schizophrenia Neurotic 


Character Disorder 


Psycho- Proto- Com- Psycho- Proto- Com- Psycho- Proto- Com- Psycho- Proto- Com- 


Actual Diagnoses ls bin. 
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Paranoid Schizophrenia 
Unclassified Schizophrenia 
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Character Disorder 


All Subjects 
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ent in the data to be evaluated. For this rea- 
son, as previously pointed out, cases were as- 
sembled so that all combinations of categories 
were represented in the various blocks of data 
to be judged. Some were thus composed of 
both psychotic and nonpsychotic records while 
others contained only psychotic or nonpsycho- 
tic data. Analysis of variance, however, indi- 
cated that regardless of the combination of 
categories judged, no significant differences in 
the number of correct diagnostic predictions 
were present among the various combinations. 

Judgmental bias. The possibility that the 
type of setting in which a clinician works 
might bias the kinds of diagnostic judgments 
he might make was investigated. A compari- 
son of placements within the four diagnostic 
groups (see Table 3) by analysis of variance 
showed no significant differences between 
judges in utilization of a particular diagnosis. 
Similar analysis when data was regrouped into 
psychotic and nonpsychotic categories dis- 
closed that in the instance of the combined 
psychogram-protocol condition, a significantly
greater number of cases (p < .05) were clas- 
sified by each judge as psychotic. Although 
falling short of significance, a similar tend- 
ency toward disproportionate judgments of 
psychosis was found for the other two bases 
of judgment. 

Judgmental reliability. Reliability of judg- 
ments is presented from the point of view of 
the diagnostic correspondence of the judg- 
ments made separately from psychograms and 
protocols of the same patients and from over- 
all interjudge diagnostic agreement. Chi- 
square analysis of the 60 cases where proto- 
cols and psychograms were independently 
judged failed to disclose a significant corre- 
spondence of judgment by judges, considered 
individually or collectively, with respect to 
specific diagnoses or the psychotic—nonpsy- 
chotic dichotomy. Interjudge comparisons for 
each judgment condition likewise failed to 
show significant agreement among judges with 
respect to any single diagnostic category. 
Analysis, however, indicates that when only
the psychotic-nonpsychotic dichotomy is
considered, agreements are significant at the p
< .05 level. When the criterion of consensus was
set as agreement of any three of the five
judges, agreement with respect to both specific
diagnoses and the psychotic-nonpsychotic
dichotomy rose considerably, the former
being significant at the p < .05 level and the
latter at the p < .01 level. However, agreement
was still far short of the reliability levels
necessary for individual case predictions.


Discussion 


The results of this study are disappointing
from the standpoint of diagnostic prediction
when Rorschach data are employed in the
absence of additional test materials. The findings
are quite definite: Rorschach data in
isolation do not offer a sufficiently broad
judgmental base for consistent individual
prediction, either within specific diagnostic
categories or within the psychotic-nonpsychotic
dichotomy.

In recent years, increasing emphasis has 
been put on the importance of content in the 
Rorschach. One might therefore have assumed 
that Rorschach protocols which furnish such 
data might enhance predictability, an assump- 
tion not supported by the results of this 
study. But perhaps this study is an unfair test 
of the usefulness of the Rorschach in that it 
does not duplicate the typical diagnostic pre- 
diction situation. In the typical situation, the 
evaluator may have available other psychological
test data and cues from interpersonal
contact with the patient and his reaction to 
the testing situation which may serve to am- 
plify or modify conclusions from the Ror- 
schach data. 

The study, however, indicates the lack of 
consistency among judges in the diagnostic 
interpretation of identical data. If the inves- 
tigation had concerned itself only with pre- 
dictions in which judges expressed confidence, 
consistency might have been greater. 

Similar negative findings from the objective 
analysis perhaps were to be expected, since 
numerous objections may be raised to the 
validity of any procedure which would com- 
pile determinants to arrive at a diagnosis. 
Some significant relationships of determinants 
to diagnosis were found, but such significance 
was largely due to the size of the sample em- 
ployed and was not of value for making in- 
dividual predictions in this study. 


Summary

The consistency with which individual or 
group diagnostic categorization can be pre- 
dicted from the Rorschach was investigated 
in two ways; one was an objective statistical 
approach and the other a subjective judg- 
mental approach. In the first, an attempt was 
made to relate statistically either single or 
patterned Rorschach determinants to previ- 
ously made diagnostic judgments. The results 
failed to uncover any useful means of arriving
at a diagnosis. The judgmental approach was 
found to be equally unsuccessful in achieving 
consistent diagnostic predictions. 


Psychologists concerned with the problems
of the prediction of the future performance of 
individuals have, in recent years, become in- 
creasingly explicit in acknowledging the ne- 
cessity of taking personality factors into ac- 
count in predicting future achievement (9, 11, 
14). Along with this growing interest in the 
problem of the influence of personality vari- 
ables, there has been an increase in empirical 
studies of the relationships between scores on 
paper and pencil personality tests and meas- 
ures of achievement such as grade point av- 
erages (2, 6, 9, 11, 12). 

Perhaps the personality characteristic most 
studied in recent years has been anxiety (as 
defined by score on tests like the Taylor Anx- 
iety Scale). Since this variable has been shown 
to influence performance in numerous labora- 
tory situations (1, 4, 7, 8, 10, 13), it is rea- 
sonable that there would be interest in ex- 
tending our knowledge of its effects to “real 
life” behavior such as academic achievement. 
With respect to the Taylor scale, the results 
thus far have been disappointing. The most 
reliable studies reported in the literature indi- 
cate that level of anxiety has no demonstrable 
effect on academic achievement (9, 12). This 
is a rather surprising result in view of the fre- 
quent clinical observation that high anxiety 
leads to a breakdown or decline in achieve- 
ment. 

In light of this apparent failure of test-defined
anxiety scores to predict achievement,
it would seem fruitful to re-examine our
notions as to what anxiety scales are measuring.
In this regard, it seems that those of us who
have worked with tests like the Taylor scale
have ignored one very important observation
made by clinicians and laymen: people are
not anxious every minute of the day, and
often we can specify the conditions which
will lead to an increase in anxiety
in the individual. Perhaps what we need are
not general anxiety scales oriented towards
the kinds of anxiety responses (e.g., sweating,
awareness of an increase in tension, etc.)
which an individual will admit to but, rather,
tests designed to assess the specific conditions
under which anxiety is aroused, or, of
course, perhaps some combination of
both.


1 The writer is indebted to Seymour B. Sarason
for his cooperation in securing the data on which
this report is based.

The present study was designed to evaluate 
the role of anxiety in academic achievement 
when anxiety is defined as a general charac- 
teristic and also as one specific to a particular 
situation. Scores on two questionnaires de- 
vised by S. Sarason (3, 5), a general anxiety 
scale and a test anxiety scale, were used to 
select extreme anxiety groups. College en- 
trance examination scores and grade point av- 
erages were used as response measures. It was 
expected that the test anxiety scale might be 
of more predictive utility in this situation 
than the general anxiety scale because of the 
test anxiety scale’s more intensive assessment 
of anxiety responses and their antecedents in 
a specific situation, the testing situation. Evi- 
dence presented by Sarason and Mandler (11) 
and Gordon (2) seems to indicate that this 
focusing on the specifics of the test situation 
can lead to some increase in the ability to 
predict such things as achievement in aca- 
demic situations. However, a direct compari- 
son of the predictive utility of the two kinds 
of anxiety scales (general and test) has as yet 
not been attempted. In the present study, it
was possible to make this kind of comparison. 

A word should be said at this point of the 
way in which we shall use the term anxiety 
and the kinds of predictions this definition 
leads to. Our knowledge of the role of anxiety 
on behavior is at present quite meager, even 
in relatively simple experimental situations. 
There is, however, a growing body of evidence 
which can be interpreted as suggesting that 
individuals obtaining high scores on anxiety 
questionnaires differ from other subjects (Ss) 
in the extent to which their performance is 
disrupted under conditions of stress (4, 5, 8, 
10, 11). It might further be inferred that an 
individual’s performance is disrupted to the 
extent that he brings to novel situations an- 
ticipations of failure, rejection, and inability 
to cope with the requirements of the situa- 
tion. If this were so, high anxious Ss would 
be regarded as emitting these interfering task- 
irrelevant responses (e.g., self-verbalizing, “I 
can’t pass this test”) to a greater extent than 
do other Ss in the anxiety score distribution. 
In the case where certain Ss admit to these 
interfering responses we would expect a lower 
level of performance for them than for other 
Ss in the score distribution. While the rela- 
tionships existing between general anxiety 
(i.e., anxiety experienced in a wide variety of 
situations) and test anxiety (anxiety specif- 
ically in testing situations) are as yet far 
from clear, we would expect that Ss with high 
test anxiety scores would do relatively more 
poorly than low test anxious Ss in their per- 
formance on important entrance examinations 
for which they could not prepare. Unless gen- 
eral and test anxiety were very highly cor- 
related, we would not necessarily expect in- 
dividuals admitting to anxiety in a variety of 
situations to be anxious in a testing situation. 
In any event, we would expect smaller differ- 
ences on entrance examinations for extreme 
general than for extreme test anxiety groups. 

A further prediction was made with respect 
to the grade-point averages of extreme test 
anxiety groups. On the assumption that high 
anxiety leads to performance decrements in 
novel situations for which preparation is not
possible, it was expected that high and low 
test anxious groups would not differ in grade 
point average. This kind of result was expected
on the basis that in a particular classroom
situation the deleterious effects of the
novelty of test situations would decrease by 
virtue of the increase in familiarity with the 
teacher, classroom, etc., over a period of a 
semester. Another possibility leading to the 
same prediction is that high test anxious Ss’ 
anxiety may not be reduced during a course 
but that they may have sufficient time over 
the period of a semester to overlearn course 
material. 


Method 


The Ss were 305 Yale University liberal 
arts undergraduate students. Most of these 
Ss were administered the Test Anxiety (TA) 
questionnaire and the General Anxiety (GA) 
questionnaire in the fall of 1953. In some 
cases, only TA scores were obtained. In all 
cases, the anxiety questionnaires were ad- 
ministered during introductory psychology 
class meetings. At the time of taking these 
questionnaires, the great majority of Ss were 
sophomores or juniors. It is important to keep 
this fact in mind in interpreting the present
results. Although the anxiety scales used are 
acceptably stable over time, we had no way 
of determining whether or not the same re- 
sults as those presented here would be ob- 
tained if the Ss used were largely freshmen 
or seniors. 

In the summer of 1956, as many of the following
measures as were available were recorded
for each S: (a) TA score, (b) GA
score, (c) Scholastic Aptitude Test (SAT)
scores (this test is largely verbal in nature),²
(d) Mathematical Aptitude Test (MAT)
scores,² and (e) yearly course grade point
averages.

Available for the 305 Ss were TA, GA, SAT, 
and MAT scores. However, grades for the four 
years of college were available for only 227 of 
these Ss.³ Consequently, the Ns for the statistical
tests performed on the five measures
listed above varied. 


2 SAT and MAT are examinations given to all
Yale undergraduates upon entrance to the University. 

3 Clearly, then, the results concerning GPA are 
generalizable only to those students who completed 
four years of college. 


Results and Discussion


In order to determine whether or not the anx- 
iety measures could be of utility in predicting 
academic performance, correlations (Pearson 
r’s) were obtained between the academic meas- 
ures and the GA and TA scores. Table 1 pre- 
sents the results of these analyses. Significant 
negative correlations were obtained between 
TA and the two entrance examinations (-.14
and -.20). However, since it appeared that
TA might be curvilinearly related to SAT and 
MAT, epsilons were computed between TA 
and SAT and MAT and tests made for curvi- 
linearity. In the case of TA and MAT, the re- 
sults of the statistical tests revealed that the 
hypothesis that TA and MAT are rectilinearly 
related could not be rejected. However, the 
departure from rectilinearity was significant 
(p < .01) for the TA-SAT relationship. The 
value of epsilon between TA and SAT was 
found to be .26. It thus appears that high TA 
scores do seem to be related to relatively poor 
performance on achievement tests of the type 
used in the study. On the other hand, the 
correlations presented in Table 1 between GA 
and the two entrance examinations were not 
significant. Therefore, with respect to the 
SAT and MAT measures, the predictions con- 
cerning the performance of Ss varying in TA 
and GA scores were largely supported. 
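
The curvilinearity check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's computation: TA scores are treated as k discrete classes, epsilon is taken to be Kelley's bias-corrected correlation ratio, and departure from rectilinearity is tested with the conventional F ratio that compares eta squared with r squared. The function name and the toy data are hypothetical; the article does not give its computing formulas.

```python
from collections import defaultdict
from math import sqrt

def linearity_check(x, y):
    """Return r, eta squared, epsilon squared, and F for departure from linearity.

    x: class values of the predictor (e.g., anxiety score classes)
    y: criterion scores (e.g., an entrance examination score)
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_x = sum((u - mean_x) ** 2 for u in x)
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    r = cov / sqrt(ss_x * ss_total)

    # Between-class sum of squares of y around the grand mean.
    classes = defaultdict(list)
    for u, v in zip(x, y):
        classes[u].append(v)
    k = len(classes)
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - mean_y) ** 2
        for vals in classes.values()
    )
    eta2 = ss_between / ss_total

    # Kelley's correction shrinks eta squared for the number of classes (assumed form).
    eps2 = 1 - (1 - eta2) * (n - 1) / (n - k)

    # F with (k - 2, n - k) df: variance explained beyond the straight line.
    f_nonlinear = ((eta2 - r ** 2) / (k - 2)) / ((1 - eta2) / (n - k))
    return r, eta2, eps2, f_nonlinear

# Toy data with an inverted-U relation, invented for illustration only.
x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 4, 5, 1, 2]
r, eta2, eps2, f_nonlinear = linearity_check(x, y)
print(r, round(eta2, 3), round(eps2, 3), round(f_nonlinear, 2))
```

In the toy data the Pearson r is zero while eta squared is large, which is the kind of pattern a test for curvilinearity is meant to detect and a rectilinear r alone would miss.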

It had been expected that Ss with high TA
scores would do relatively poorly on SAT and
MAT but that this relative inferiority would
disappear in course work as a function of
either increased familiarity with or overlearning
of subject matter on which they would
be tested during the course. The results of
Table 1 support this hypothesis, but it seems
to take high TA Ss longer to “catch up” with
other Ss than had been expected. For the first
two years of college, significant negative r’s
were obtained between TA and grade point
averages. For the last two years these r’s fail
to reach significance.


Table 1

Correlations Between TA and GA Scores and
SAT, MAT, and GPAᵃ

Anxiety                              GPA
scale     SAT      MAT      Yr. 1    Yr. 2    Yr. 3    Yr. 4
TA       -.14*    -.20**   -.14*    -.17*    -.06     -.003

ᵃ Pearson r’s involving SAT and MAT have N = 305.
Pearson r’s involving GPA’s have N = 227.
* p < .05.
** p < .01.
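
The significance of the TA correlations can be checked from r and N alone. The sketch below is an illustration under stated assumptions rather than the author's computation: it converts r to t with N - 2 df and, because N is large, approximates the two-tailed p from the normal curve. The function names are hypothetical; the r values and Ns are those reported for the TA scale.

```python
from math import sqrt, erfc

def r_to_t(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def approx_two_tailed_p(t):
    """Two-tailed p from the normal curve; adequate only for large df."""
    return erfc(abs(t) / sqrt(2))

# TA correlations as reported, paired with the N on which each is based.
reported = {
    "SAT": (-0.14, 305),
    "MAT": (-0.20, 305),
    "GPA Yr. 1": (-0.14, 227),
    "GPA Yr. 3": (-0.06, 227),
    "GPA Yr. 4": (-0.003, 227),
}
p_values = {
    name: approx_two_tailed_p(r_to_t(r, n)) for name, (r, n) in reported.items()
}
for name, p in p_values.items():
    print(f"{name}: p is about {p:.3f}")
```

With these Ns even an r of -.14 falls below the .05 level, which illustrates how correlations of quite low order can be statistically significant in samples of this size.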

A quite unexpected finding concerned the 
r’s between GA and grade point averages. It 
had been predicted that GA scores would be 
unrelated to grade point averages. This pre- 
diction was made on the basis that knowing 
that an individual is anxious in a variety of 
situations does not necessarily mean that he 
will be anxious in a testing situation. The high 
r of .55 obtained between GA and TA might 
provide some grounds for expecting a tend- 
ency for the results with respect to GA to be 
similar to those for TA. On the contrary, 
however, there seems to be a tendency for 
high GA scores to be associated with high 
grade point averages. This result gives striking
support to the contention that in discussing
the effect of anxiety on performance,
it is necessary to be clear as to the situations 
in which Ss admit to experiencing anxiety. 

One aspect of the results summarized in 
Table 1 which should be kept in mind is that 
all of the r’s except that between TA and GA 
are quite low. It is unlikely that r’s of such 
low order, even though they are significant, 
can be used for the purposes of prediction 
of the academic performance of individuals. 
However, such results can be of theoretical
utility. Because of the encouraging results pre- 
sented in Table 1, it seemed of interest to 
further test the hypotheses made with respect 
to the effects of TA and GA on academic per- 
formance using extreme anxiety groups since 
our predictions stemmed from certain expec- 
tations concerning the performance of high 
anxious Ss. Consequently, the performances on 
SAT and MAT of two extreme groups of high 
and low TA Ss were compared. Table 2 sum- 
marizes these results. When the upper 6% 
(H1) of the 305 Ss in the TA score distribu- 
tion was compared with the lower 7% (L1) 
of the TA distribution, the low TA group was 
significantly superior (p < .01) to the high 
TA group on both SAT and MAT. When Ss
scoring on TA between the 8th and 17th
percentiles (L2) were compared with Ss scoring
between the 82nd and 93rd percentiles (H2),
no significant differences were obtained. It
thus appears that the TA questionnaire may
be sufficiently sensitive only in selecting from
a large group of individuals a small subgroup
of Ss who perform in the manner predicted in
this study. It is interesting that this result is
quite consistent with the findings in other
studies that anxiety questionnaires divide Ss
into two groups: a small high anxious group
and a group composed of Ss in the rest of the
anxiety score distribution. In this regard, it is
relevant to note that the two low TA groups
(L1 and L2) did not differ significantly on
either of the two entrance examination
measures, but that the H1 TA group was
significantly inferior (p < .05) to the H2 TA group,
and that this latter group did not differ
significantly from the low TA groups. When
groups of high and low scorers on the GA
questionnaire were compared with respect to
SAT and MAT performance no significant
differences were obtained.


488 Irwin G. Sarason


Table 2

Means and SDs for Two High and Two Low Test
Anxiety Groups on SAT and MAT

TA       TA                  SAT                MAT
Group    Score     N       M       SD        M       SD
L1       0-7       22    611.41   66.81    653.21   70.21
L2       8-10      [?]   606.05   71.12    624.34   79.63
H2       [?]-28    38    588.94   74.63    602.31   84.73
H1       29-35     18    543.92   63.75    555.03   68.06


Fig. 1. GPA as a function of years in college for four
TA groups.
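
The comparison of the most extreme TA groups can be reproduced approximately from the summary statistics in Table 2. The sketch below is only an illustration: it assumes a pooled-variance t test for independent groups and assumes the tabled SDs use N - 1 in the denominator, neither of which the article states. The function name is hypothetical, and the SAT SD for L1 is read as 66.81.

```python
from math import sqrt

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-groups t from means, SDs, and Ns, pooling the variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# L1 (N = 22) versus H1 (N = 18), values as read from Table 2.
t_sat, df = pooled_t(611.41, 66.81, 22, 543.92, 63.75, 18)
t_mat, _ = pooled_t(653.21, 70.21, 22, 555.03, 68.06, 18)
print(round(t_sat, 2), round(t_mat, 2), df)
```

Under these assumptions both t values exceed roughly 2.71, the approximate two-tailed .01 critical value for 38 df, which agrees with the reported superiority of the low TA group on both examinations.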

In view of the correlations between TA 
scores and grade point averages presented in 
Table 1, one might expect, for the first two 
years, to find a superiority of low to high TA 
groups followed by no difference in the third 
and fourth years. Fig. 1 presents the curves 
of the H1, H2, L1, and L2 TA groups. These 
curves represent grade point averages as a 
function of years in college for Ss scoring 
above the 91st percentile (H1), between the 
82nd and 91st percentiles (H2), below the 9th 
percentile (L1), and between the 9th and 
18th percentiles (L2) for the 227 Ss on whom 
grade point averages were available for four 
years. There are three observations to be made 
concerning Fig. 1. One aspect of it which is 
striking is the clearcut superiority of the L1
group over the L2 as well as over the two
high TA groups. This superiority of the L1
TA group was found to be statistically sig- 
nificant. 
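
The grouping rule described above can be sketched in a few lines. This is a hypothetical illustration: the article does not say how percentiles were computed or how ties at the cutting points were handled, so the sketch assumes the midrank definition of percentile rank and strict inequalities at the boundaries. The scores are invented.

```python
def percentile_ranks(scores):
    """Percentile rank of each score: percent below it plus half of the ties."""
    n = len(scores)
    ranks = []
    for s in scores:
        below = sum(1 for t in scores if t < s)
        ties = sum(1 for t in scores if t == s)
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks

def ta_group(pr):
    """Assign an extreme-group label from a percentile rank, or None."""
    if pr > 91:
        return "H1"  # above the 91st percentile
    if pr > 82:
        return "H2"  # between the 82nd and 91st percentiles
    if pr < 9:
        return "L1"  # below the 9th percentile
    if pr < 18:
        return "L2"  # between the 9th and 18th percentiles
    return None      # middle of the distribution, not used in the curves

scores = list(range(100))  # invented scores, one S per score value
groups = [ta_group(pr) for pr in percentile_ranks(scores)]
counts = {g: groups.count(g) for g in ("L1", "L2", "H2", "H1")}
print(counts)
```

With real questionnaire scores, ties at the cutting points would make the group sizes unequal, which may be one reason the extreme groups in such studies rarely contain exactly the nominal percentage of Ss.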

Another interesting finding represented in 
Fig. 1 is the general similarity in the curve of 
the L2 group to that of the H2 group. It had 
been expected that the LA and HA groups 
would not differ significantly when grade point 
average was used as the measure of academic 
achievement. Thus, to the extent that the H2 
group can be considered a high anxious group, 
the similarity in the L2 curve to the H1 and 
H2 curves is as was expected. Clearly, how- 
ever, the superiority of the L1 group to the 
other three groups was not expected. At this 
point one can only conjecture as to the mean- 
ing of the superiority of this group. It is pos- 
sible that a test-taking attitude (e.g., that 
tapped by the MMPI K scale) peculiar to 
L1 Ss may have contributed to their superi- 
ority in grade point average. No measure of 
test-taking attitude was available for the Ss 
used in this study. It would seem to be of
interest to take this variable into account in
future research.

In addition to the unexpected superiority
of the L1 group to all other groups whose
performance is depicted in Fig. 1, there is a
marked difference in form of the H1 group
curve from the other three. Most striking about
this group’s performance is the great improvement
in the second year following
an extremely poor first year showing.
It had been expected that high anxious Ss
would perform as well as low anxious Ss in
course work on the assumption that during a
semester the high anxious individual can
increase his familiarity with the task, the
instructor, etc., as well as possibly overlearn
course material. The curves in Fig. 1 suggest
that the process of acclimatization for high
anxious Ss may be considerably slower than
we had expected.

[Figure legend, partly legible: H1 (N = 20), H2 (N = 20).]

In order to follow up the unexpected sig- 
nificant positive correlations obtained between 
GA scores and grade point averages, two extreme
high and two extreme low GA
groups were compared in terms of the four
yearly grade point averages obtained. Fig. 2
presents curves for the groups of Ss scoring
in the upper 8% of the GA score distribution
(H1), the lower 8% (L1), Ss scoring
in the 9th to 18th percentile range (L2),
and Ss in the 82nd to 91st percentile range
(H2) of the GA score distribution. A large,
highly significant, superiority of high to low 
GA Ss is clearly in evidence throughout the 
four-year period. This result is not in accord 
with the predictions made prior to carrying 
out the study and, together with the surpris- 
ing superiority of the L1 TA group to the 
other TA groups studied, poses important 
problems for future research. At present, it is 
unclear what individual difference variables 
are contributing to the superiority in achieve- 
ment of both individuals who admit to virtu- 
ally no anxiety in test situations and indi- 
viduals who admit to great anxiety in a wide 
variety of situations. 

It is clear, however, that it is important in 
discussing the effects of anxiety on perform- 
ance to specify the manner in which anxiety 
is measured (i.e., by means of which instrument).
The results of the present study reveal
the importance of establishing the specific
situations in which an individual experiences
anxiety if one is interested in predicting
his future performance in specific situations.


Fig. 2. GPA as a function of years in college for four
GA groups.


Summary 


1. The relationships of anxiety as meas- 
ured by the Test Anxiety (TA) and General 
Anxiety (GA) questionnaires to entrance ex- 
aminations and grade point averages were 
studied. 

2. TA scores tended to correlate negatively 
with measures of academic achievement, al- 
though with increase in number of years in 
college the negative correlation disappeared. 
GA scores failed to correlate significantly with 
entrance examination scores, but tended to 
correlate positively with grade point averages. 

3. In studying extreme anxiety groups, high 
TA Ss performed at a significantly lower level 
than did low TA Ss. In addition, significant 
differences were found within each of the low 
and high test anxious groups. Thus, on en- 
trance examinations the most extreme test 
anxiety group performed at a significantly 
lower level than did a group of high anxious 
Ss with less extreme TA scores, and for course 
grades, the Ss with the lowest TA scores performed
at a considerably higher level than
did low anxious Ss with less extreme scores. 

4. The results demonstrated that relation- 
ships between anxiety and achievement vari- 
ables depend to an important extent on the 
nature of the instrument employed to measure 
anxiety. 


Received February 8, 1957. 


Generalization as a Function of Manifest Anxiety
and Adaptation to Psychological Experiments¹


Sarnoff A. Mednick²


Harvard University 


By attributing drive properties to anxiety, 
as measured by the Taylor Manifest Anxiety 
Scale (MAS), this variable has been incorpo- 
rated into the Hullian framework (8). In this 
context, defense conditioning investigations 
have indicated that individuals scoring high 
on the MAS condition faster than low scorers 
(5, 6, 7, 9). However, a study which utilized 
classical reward training of the salivary re- 
sponse, a situation relatively void of noxious 
stimulation, revealed no differences between 
low anxious (LA) and high anxious (HA) 
groups (1). Work by Rosenbaum and Wenar 
on stimulus generalization (SG) suggests that 
under strong stress conditions the HA group 
generalizes more than the LA group (4, 13). 
However, when Rosenbaum used only mild 
shock or buzzer, he failed to find a difference 
in generalization between the anxiety groups. 
These findings suggest that the HA scorer on 
the MAS does not chronically carry his anxiety
around with him, but that it must be specifically
elicited by some stress situation. This
interpretation has been noted earlier (7, 12). 
The present study was undertaken to test for 
differences in SG as a function of scores on 
the Heineman forced-choice form of the MAS 
in a situation containing no stress deliberately 
introduced by E. 


Method 
Apparatus 


The SG apparatus was adapted from one
devised and described in detail by Brown
et al. (2). It consists of a plywood panel
(6 ft. × 2 ft.), painted flat black, upon which
is mounted a horizontal row of eleven lamps
(115 v., 7.5 w.) spaced 9 degrees of visual
angle apart. The lamps are designated 1 to 11,
with Lamp 1 being on S’s left and Lamp 6
being the center lamp. The panel is curved so
that all lamps are equidistant from S’s eyes
when he is seated directly in front of Lamp 6,
3.5 feet away. A red-jeweled pilot lamp, 2
inches above Lamp 6, serves as a fixation point
and a ready signal. A reaction key was placed
on S’s preferred side and he was allowed to
move it into a comfortable position. Response
latency was measured to the nearest 1/100th
of a second with a Standard Electric Timer.
The stimulus board effectively hid both E and
the equipment from S.


1 This study was supported in part by a grant
from the Laboratory of Social Relations, Harvard
University.

2 Part of this study was completed while the
author was on a USPHS postdoctoral Cooperating
Institutions Fellowship. The cooperating institutions
are Northwestern University, College of Medicine of
the University of Illinois, University of Chicago, and
Michael Reese Hospital.


Procedure 


Ss were sixty undergraduate volunteers from 
Northwestern University psychology classes to which 
the Heineman forced-choice form of the MAS had 
been administered and scored by Key 2 (3). On the 
basis of their test scores, high anxious (HA), me- 
dium anxious (MA), and low anxious (LA) groups 
of 20 Ss each (evenly divided as to sex) were estab- 
lished. The median MAS scores for the HA, MA, 
and LA groups, respectively, were 70, 57, and 44. 
The median score of the original Heineman sample 
was 54. 

The instructions informed S that E was in- 
terested in how fast they were capable of re- 
acting. It was explained that they were to 
react by lifting their finger from a reaction 
key when Lamp 6 was lit and to continue 
holding down the key when any of the pe- 
ripheral lamps were lit. Speed was stressed 
again and Ss were urged not to worry about 
accidental responses to the peripheral lamps 
but to simply proceed to the next trial. There 
followed 20 training trials to Lamp 6 with a 
10- to 20-sec. intertrial interval. The foreperiod
between ready signal and stimulus was
varied between 2 and 5 sec. Without any warning
the 20 training trials were followed by a
generalization test series consisting of four
presentations of each of the 10 peripheral
lamps. These 40 generalization test trials were
interspersed among 48 “booster trials” with
Lamp 6. Ten different orders, each beginning
with a separate peripheral lamp, were used for
the generalization test trials. One male S and
one female S from each group were randomly
assigned to each order.


Table 1
Median Number of SG Responses

                      Anxiety Groups
Test                                        All
Period           High    Medium    Low     Cases
Tested early      6.5      9.0     5.5      6.5
Tested late       4.5      6.5     4.5      5.5
All cases         5.5      8.5     5.0      6.0


Because of scheduling arrangements with
other Es, the MA group was tested in the first
and fourth weeks of the quarter (ten Ss each 
time) and the HA and the LA groups were 
tested in the sixth and ninth weeks of the 
ten-week quarter (ten Ss from each group 
each time). 


Results 


Table 1 compares the median number of 
SG responses given by the three anxiety 
groups. The median was used because of the 
presence of a few atypical individuals in the 
HA and MA groups who gave as many as 31 
SG responses, seriously influencing the mean. 
An analysis of variance by ranks showed the 
groups to differ significantly (chi square = 
6.52, 2 df, p < .05). Table 2 presents the
generalization data comparing the three anxiety
groups by simply counting for each individual
the number of SG lamps eliciting responses
during the 40 test trials (SG score). This
score can only range from 0 to 10 and thus
has the advantage of reducing the weight
given to the aberrant individuals referred to
above. These SG scores were submitted to an
analysis of variance (Table 3) to test the
significance of the differences between the
anxiety groups. As can be seen from Table 3, the
groups differed significantly. However, inspection
of Tables 1 and 2 makes clear that the
significance of the difference is mainly due to
the MA group demonstrating considerably
more SG than the other two groups. The HA
and LA groups apparently generalize at about
the same level.


Table 2
Mean Number of SG Lamps Eliciting Responses

                      Anxiety Groups
Test                                        All
Period           High    Medium    Low     Cases
Tested early      5.3      7.0     4.3      5.53
Tested late       4.0      4.9     4.6      4.50
All cases         4.65     5.95    4.45     5.02
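
The analysis of variance by ranks reported above can be illustrated with a short sketch. It assumes that the test was of the Kruskal-Wallis type, in which the statistic is referred to chi square with (number of groups - 1) df; the article does not name the procedure, and no tie correction is applied here. The function names and the nine scores are invented for illustration; only the value 6.52 comes from the text.

```python
from math import exp

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H using midranks for ties (no tie correction applied)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Midrank of each distinct value: mean of the 1-based positions it occupies.
    midrank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    total = sum(sum(midrank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

def chi_square_p_2df(x):
    """Upper-tail probability of chi square with exactly 2 df."""
    return exp(-x / 2)

h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # invented scores
p_reported = chi_square_p_2df(6.52)  # statistic reported in the text
print(round(h, 2), round(p_reported, 3))
```

For 2 df the upper tail of chi square has the closed form exp(-x/2), so the reported value of 6.52 corresponds to a probability a little under .04, in line with the stated p < .05.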

In view of the experimental procedure,
these results could either be attributed to the
MA group having more drive than the LA and
HA groups or to the fact that they were tested
earlier in the quarter. At Northwestern University,
students in elementary psychology
courses are required to serve for five experiment
hours and encouraged to serve up to ten
hours. For the half of the MA group tested
earlier, this study was the first they had
experienced, while the half of the MA group
tested later had already served in two to three
studies. For the HA and LA groups tested
earlier this study was either the fifth, sixth,
or seventh, while for the half of these groups
tested later this study was either the ninth or
tenth they had experienced. (The HA and LA
groups were heavily tested that quarter; some
Ss voluntarily participated in up to fourteen
studies.)


Table 3

Analysis of Variance of Stimulus Generalization Scores

Source                           df      MS       F
Time of test                      1    16.01    4.17*
Anxiety groups                    2    13.27    3.45*
Interaction: Time × Anxiety       2     7.47    1.94
Within                           54     3.84
* p < .05.
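
The F ratios in Table 3 follow directly from the tabled mean squares, each effect mean square being divided by the within-groups mean square. A minimal sketch, with a hypothetical function name and with the interaction mean square read as 7.47:

```python
def f_ratios(mean_squares, ms_within):
    """Divide each effect mean square by the within-groups mean square."""
    return {source: ms / ms_within for source, ms in mean_squares.items()}

# Mean squares as read from Table 3 (interaction MS taken to be 7.47).
effects = {"Time of test": 16.01, "Anxiety groups": 13.27, "Time x Anxiety": 7.47}
ratios = f_ratios(effects, ms_within=3.84)
for source, f in ratios.items():
    print(f"{source}: F = {f:.3f}")
```

The recomputed ratios agree with the tabled values of 4.17, 3.45, and 1.94 to within about .01, the small differences presumably reflecting rounding of the mean squares.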

In order to evaluate the factor of “time- 
in-quarter-at-which-SG-tests-took-place” each 
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Table 4 
Percentage of Group Responding at Each Test Lamp 


Test Lamps 
Anxiety 
Group 1 2 3 4 5 6 7 8 9 10 11 
HA 
Late Dn Dn DH @ 10 
All Dn 5S SS 70 100 on Ss 
MA 
Early 8: 8 8 #80 100 
LA 
Late 2 30 70 10 BW 


group was split into the half tested earlier 
(first week for MA group, sixth week for HA 
and LA groups) and the half tested later 
(third week for the MA group, ninth week for 
the HA and LA groups) in the quarter. The 
median number of SG responses for the early 
and late subgroups of the HA, MA, and LA 
groups are presented in Table i. The early 
groups are consistently more responsive. The 
mean SG scores for the early and late sub- 
groups of the HA, MA, and LA groups are 
presented in Table 2; the analysis of variance 
of the SG score data is presented in Table 3. 
The significance of the early-late variable sug- 
gests that time of testing in quarter influences 
amount of SG responsivity. The Anxiety 
Groups X Time of Testing interaction is not 
significant. 
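
The F ratios in Table 3 follow from dividing each effect mean square by the within-groups mean square. The short Python sketch below redoes that arithmetic with the tabled values; it reads the interaction mean square as 7.47, which is an assumption about a hard-to-read entry, and the variable names are illustrative only. The recomputed ratios agree with the tabled ones to within about 0.01, a discrepancy of the size that rounded mean squares would produce.

```python
# Mean squares as listed in Table 3; the interaction value is read as 7.47
# (an assumption about a hard-to-read table entry).
MS_WITHIN = 3.84

mean_squares = {
    "Time of test": 16.01,
    "Anxiety groups": 13.27,
    "Time X Anxiety": 7.47,
}


def f_ratio(ms_effect, ms_error=MS_WITHIN):
    """F is the effect mean square divided by the error (within) mean square."""
    return ms_effect / ms_error


f_values = {source: f_ratio(ms) for source, ms in mean_squares.items()}

for source, f in f_values.items():
    print(f"{source:15s} F = {f:.2f}")
```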

Table 4 presents the data in terms of the 
percentage of the groups that responded at 
each test lamp. As can be seen, the early sub- 
groups are consistently more responsive than 
the late subgroups. The early HA Ss respond 
more than the early LA Ss. 

In view of the time of testing discrepancy, 
the results of the MA group are difficult to 
interpret. However, the HA and LA groups 
were tested at the same times. A Mann- 
Whitney U Test comparing early HA with 
early LA SG scores shows the early HA group 
demonstrating significantly more SG responsivity.
(U = 22.5, p = .045, one-tail test.)
The late HA group and late LA group did not
differ significantly. (U = 41.5, n.s.)
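
For readers unfamiliar with the statistic, the following Python sketch shows one common way of computing Mann-Whitney U by direct pair counting, with tied pairs counted as one half. The scores are hypothetical and are not the study's data; the function name is likewise an illustration. Half-point values of U, such as those reported above, are what tied scores produce under this counting rule.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) by pair counting.

    U_x counts the (x, y) pairs in which the x score is larger;
    a tied pair contributes one half to each group's count.
    """
    u_x = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u_x += 1.0
            elif xi == yi:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical SG scores for illustration only, NOT the study's data.
early_ha = [6, 5, 7, 4, 6]
early_la = [3, 4, 2, 5, 3]

u_ha, u_la = mann_whitney_u(early_ha, early_la)
u = min(u_ha, u_la)  # the smaller value is the one referred to U tables
print(u_ha, u_la, u)
```

The significance level then comes from tables of U (or a normal approximation for larger groups), which this sketch does not attempt.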

In summary, the results indicate that the 
HA and MA Ss that were tested early and had 
taken part in fewer previous studies (naive 
Ss) tended to generalize more than the HA 
and MA Ss who were tested later and had 
taken part in many studies (sophisticated Ss). 
The LA group showed no difference between 
sophisticated and naive Ss. The relatively 
naive HA Ss generalized significantly more 
than the naive LA group. However, with so- 
phisticated Ss this difference was not observed. 

Discussion 

The results may be interpreted as indicat- 
ing that a high MAS score predicts that anxi- 
ety may be elicited by “stress situations” and 
is not a chronic state which would manifest 
itself in any circumstance. A low MAS score 
predicts a relatively high threshold for anxiety 
elicitation. As Taylor has stated, “to many 
college sophomores psychology experiments 
per se may be seen as somewhat threatening” 
(12, p. 312). In view of this it might be speculated
that the early HA Ss probably reacted
with some anxiety to the testing while the 
early LA Ss did not, resulting in the observed 
differences in SG responsivity. However, after 
experiencing many psychological studies the 
HA Ss’ situational anxiety adapted out, leaving
no difference between the late HA and the
late LA Ss. A similar adapting of the HA 
group has been observed informally by Taylor 
(11) and Spence (8). The adapting-out hy- 
pothesis would, of course, also explain the dif- 
ferences in SG between all the early Ss and 
the late Ss. 

It is possible that some variables other than 
number of psychological experiments covary 
with “early” and “late” experimental sessions. 
For example, one of these could be the trans- 
mission of information between Ss. However, 
this is not seen as a likely occurrence. None of 
the “late” Ss expressed recognition of the ex- 
periment. Even if transmission of information 
did occur its effect would probably not be 
great since the content would probably not 
differ in any important way from the instruc- 
tions provided S at the beginning of the test 
session. In addition, differential transmission 
of information in the HA and LA groups 
seems unlikely. 

The adaptation of the HA Ss to the experi- 
mental situation, resulting in a failure of drive 
arousal, has important methodological impli- 
cations for research with the MAS. It suggests 
that use of experimentally sophisticated indi- 
viduals as Ss will lessen the likelihood of the 
experimental variable producing drive differ- 
ences between the HA and LA Ss. Considera- 
tion of this variable may cast some light on 
otherwise inexplicable disparate results. 

Because of the unexpected findings, the 
original purpose of the study, testing for differences
in SG as a function of MAS scores,
has been put aside. The results seem to indi- 
cate that HA Ss, whose low threshold for anxi- 
ety arousal has not been adapted out, demon- 
strate more SG than comparable LA Ss. This 
finding is in agreement with Rosenbaum’s re- 
sults in finding differences in SG as a function 
of anxiety level (4). The studies disagree in 
that Rosenbaum only found the differences 
under very strong stress. 


Summary and Conclusions 


Groups of high, medium, and low anxious 
(HA, MA, and LA) Ss as measured by the 
Heineman forced-choice form of the Taylor 
Manifest Anxiety Scale (MAS) were tested 
for stimulus generalization (SG). An unexpected
finding, that the MA Ss showed more SG
than HA and LA Ss, led to a re-examination
of the data. The results of this tended to sup- 
port an interpretation which sees a high MAS 
score as indicating a low threshold for anxiety 
elicitation by a specific stress stimulus as op- 
posed to a chronic state. The results suggest 
that this low threshold can adapt out with 
repeated experience in the situation. 

A comparison of the HA and LA Ss that 
were relatively experimentally naive indicated 
that the HA group shows more SG than the 
LA group. 


Received March 12, 1957. 
Self Concepts in Adjusted and Maladjusted
Hospital Patients¹,²


Philip H. Chase³
University of Colorado 


Difficulties in the development of satisfac- 
tory measures of adjustment level character- 
ize part of the criterion problem faced by re- 
searchers investigating therapeutic processes. 
This paper is concerned with one aspect of the 
utility of the Q technique as a measure of ad- 
justment. 

The comparison of self and ideal sorts has 
received much recent attention. Typically, the 
correlation between S’s Q sorts for his con- 
cepts of “self” and “ideal self” is used as an 
index of adjustment level. A low or negative 
correlation is assumed to indicate an unsatis- 
factory level of adjustment, and changes in 
the magnitude of correlation from before to 
after therapy are thus assumed to reflect 
changes in level of adjustment. 

Dymond (2, p. 84) has pointed out that 
the procedure may be invalid due to the con- 
tamination of posttherapy sorts by the thera- 
pist’s expressions of satisfaction with the pa- 
tient’s progress. Contamination of the thera- 
pist’s ratings of success in treatment, often 
used as criteria, may also occur as a function 
of invalid “well-adjusted” self-references on 
the part of the patient. 


1 Adapted from a dissertation submitted in partial 
fulfillment of the requirements for the degree of Doc- 
tor of Philosophy, University of Colorado, 1956. The 
author wishes to thank Dr. Victor Raimy, Dr. 
Dorothy Martin, Dr. William Scott, and Dr. Michael 
Wertheimer, members of his committee, for their 
help, and also Dr. Howard Siple and Dr. Lewis 
Bernstein, Denver VA Hospital, for their encourage- 
ment and assistance in making possible the collec- 
tion of the data. 

2 Published with permission of the Chief Medical 
Director, Department of Medicine and Surgery, Vet- 
erans Administration, who assumes no responsibility 
for the opinions expressed or conclusions drawn by 
the author. 

3 Now at the Veterans Administration Hospital, St.
Cloud, Minnesota.


The need for study of the relation between 
self and ideal self, independent of the psycho- 
therapeutic situation, was obvious and led to 
the inception of the present research. 


Method 


The Ss were male, hospitalized veterans 
with at least an eighth-grade education. No 
Ss were included in whom central nervous 
system damage was considered as a possible 
diagnosis. All Ss were selected, according to 
their availability, within two weeks following 
their admission to the hospital. 

The “maladjusted” group consisted of three 
subgroups: 19 psychotics, 20 neurotics, and 
17 patients with character or personality dis- 
orders. The “adjusted” group consisted of 50 
patients without evidence of psychiatric diffi- 
culties who were hospitalized on medical or 
surgical wards. The “adjusted” group was di- 
vided into random halves. 

The “adjusted” and “maladjusted” groups 
did not differ significantly in mean age or edu- 
cation, nor was there a significant difference 
in marital status. 

All Ss were administered the 50 self-referring 
items in Hilden’s set number 13 (4) with instruc- 
tions to sort the items for their concepts of self, ideal 
self, and average other person. With the exception 
of minor changes, the instructions and sorting pro- 
cedures were those recommended by Hilden and in- 
volved a seminormalized, nine-category distribution. 
Intersort correlations for each S were determined by 
Hilden’s method. Mean correlations for each group 
were determined through transformation to Fisher’s 
z scores. 
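
The averaging step can be sketched briefly in Python. Each correlation is converted to Fisher's z (the inverse hyperbolic tangent of r), the z values are averaged, and the mean is converted back to a correlation. The function name and the example correlations are hypothetical, and an unweighted mean is assumed because every S sorted the same number of items.

```python
import math


def mean_correlation(rs):
    """Average correlations through Fisher's z.

    z = atanh(r) for each r, take the arithmetic mean of the z values,
    then transform the mean back with tanh.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical individual S-I correlations, not data from the study.
example_rs = [0.80, 0.50, 0.20]
print(round(mean_correlation(example_rs), 3))
```

Because the transformation stretches high correlations, the mean obtained this way is somewhat larger than the simple arithmetic mean of the same positive values.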

Three basic adjustment measures were de- 
rived from correlations between: sorts for 
concepts of self and ideal self (S-I), sorts for 
concepts of self and of the average other person
(S-AO), and sorts for concepts of ideal
self and of the average other person (I-AO).

The self- and average-other-person sorts of 
one-half of the “adjusted” Ss were each averaged
to yield mean “normal” sorts for both
concepts. Three additional measures were thus 
available from the correlations between: sorts 
for the concept of self and the “normative” 
self-sort (S-NS), sorts for the concept of self 
and the “normative” average-other-person sort 
(S-NAO), and sorts for the concept of the 
average other person and the “normative” 
average-other-person sort (AO-NAO). Mean
correlations for each group were determined 
as for the first three measures. 

It was hypothesized that the three “mal- 
adjusted” groups, taken separately or to- 
gether as a single group, would yield mean 
correlations significantly lower on all six 
measures than those of the remaining 25 “ad- 
justed” patients who were not used in the de- 
velopment of the “normative” sorts. It was 
further hypothesized that the psychotics would 
yield the lowest mean correlations, followed 
by the neurotics, and that the mean correla- 
tions of the character group would be the 
highest of the three. Since none of the pa- 
tients were receiving formal, intensive psycho- 
therapy at the time of sorting, and since none 
had been hospitalized for over two weeks, it 
was assumed that the Q sorts were relatively 
free of the possibly contaminating influences 
of psychotherapy. The hypotheses were tested 
by the use of Fisher’s t test for uncorrelated
means. 
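
A minimal Python sketch of a t test for uncorrelated means is given here for orientation. It uses the usual pooled-variance form and assumes, which the text does not state outright, that the test was applied to the Ss' Fisher z values. The numbers are hypothetical and the function name is an illustration.

```python
import math
from statistics import mean, variance


def t_uncorrelated(a, b):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical Fisher z values, not data from the study.
adjusted_z = [0.9, 0.7, 0.8, 0.6]
maladjusted_z = [0.3, 0.5, 0.4, 0.2]

t, df = t_uncorrelated(adjusted_z, maladjusted_z)
print(round(t, 2), df)
```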


Results 


The first hypothesis was partially con- 
firmed. The S-I, S-AO, S-NS, and S-NAO 
mean correlations of all three “maladjusted” 
groups, taken singly or together, were signifi- 
cantly lower than those of the “adjusted” 
group at the .01 level. No significant differ- 
ences were observed for the I-AO and AO- 
NAO measures. 

The data, although suggestive of a trend in 
the expected direction, failed to confirm the 
hypothesis of differences between the sub- 
groups of the “maladjusted” sample. This 
trend appeared in the S-I, S-AO, S-NS, and 
S-NAO measures, but did not approach sig- 
nificance at even the .05 level. 

Table 1
Mean Correlations for All Measures

            Adjusted   Total maladj.   Char. dis.   Neurotic   Psychotic
Measure     (N=25)     (N=56)          (N=17)       (N=20)     (N=19)
S-I         .642       .362            .403         .386       .296
S-AO        .560       .334            .387         .302       .295
I-AO        .593       .566            .595         .590       .510
S-NS        .656       .434            .446         .427       .427
S-NAO       .618       .352            .390         .343       .327
AO-NAO      .648       .591            .643         .572       .556

For the four discriminating measures, only
Ss in the “maladjusted” group yielded individual
correlations of zero or negative magnitude.
While the “adjusted” Ss saw themselves
as being different in varying degrees from 
their ideal or from others, only Ss with psy- 
chiatric difficulties ever saw themselves, not 
only as being different from the ideal or from 
the average other person, but in some cases 
as tending toward the opposite. 

Both “adjusted” and “maladjusted” Ss 
tended to have similar conceptions of the 
ideal self and of the average other person. 

Discussion 

Several limitations of the study should be 
noted. Firstly, the sampling cannot be as- 
sumed to be random since Ss were selected 
on the basis of being available for research 
purposes during their initial two weeks of 
hospitalization. Secondly, since all Ss were 
hospitalized, it is not known if similar non- 
hospitalized Ss might perform as those in the 
present study did. Thirdly, it is not certain 
that the use of the z statistic can be defended 
since independence of elements may not be 
assumed, but it seemed to be the most ap- 
propriate statistical method to choose for an 
analysis of the data. 

Nevertheless, the results suggest that psy- 
chiatrically maladjusted groups may be dis- 
tinguished from adjusted groups not only on 
the basis of the popular S-I measure, but 
equally well by other similar measures mak- 
ing use of self sorts. It also appears to be 
significant that only those measures including 
the self as a referent were capable of dis- 
criminating in this study. While it may be 
implied that the concept of self is significantly 
different in maladjusted persons, in contrast 
to their conceptions of the ideal self or of 
the average other person, it remains to be
seen if concepts relating to significant indi- 
vidual others, such as parents or spouse, would 
be also affected by maladjustment. 

Friedman’s data (3) indicate that the S-I 
correlations of paranoid schizophrenics are 
higher than those of neurotics and more like 
those of normals. The nine paranoid schizo- 
phrenics in the present study, however, did 
not perform significantly differently from the 
remainder of the psychotic group, and thus 
the present data fail to confirm Friedman’s 
findings. An opposite trend is suggested. Fur- 
ther study may indicate whether such factors 
as different item content or differences in 
chronicity and degree of illness will con- 
tribute to contradictory results. It may be 
significant that Friedman’s items can be said 
to be more pathologically oriented than those 
of the present study. Friedman⁴ has suggested
that sampling differences may have
led to the contradictory findings. 

To summarize the findings of the present 
study, the “maladjusted” Ss, while tending to 
perceive the concepts of the ideal self and of 
the average other person much as the “ad- 
justed” Ss did, tended to perceive themselves 
as quite different from their ideals and from 
their concepts of the average other person. It 
is suggested that this dissimilar perception of 
self reflects a realistic appraisal of the self in 
relationship to other selves in contrast to the 
beliefs of many who hold that severely dis- 
turbed patients are incapable of such ap- 
praisal. 

The data in the present study may also be 
seen as confirmatory evidence for the hy- 
pothesis that lower self-esteem is closely as- 
sociated with maladjustment. If, following 
Butler and Haigh (1), the S-I correlation is 
accepted as an operational definition of self- 
esteem, it may readily be speculated that the 
other conceptual differences reflected in the 
remaining three discriminating measures are 
at least related to lower self-esteem in mal- 
adjusted Ss. Neither the S-I nor the S-AO 
measures, however, may be accepted at this 
time as having perfect validity where self- 
esteem is concerned. Some well-adjusted per- 
sons may have exceptionally high ideals. 


4 Friedman, I. Personal communication. June 26,
1956. 


Other well-adjusted Ss may correctly perceive 
themselves as quite different from the gen- 
eralized “other.” Lastly, maladjusted Ss may 
sort items so as to yield spuriously high cor- 
relations. 

Throughout this paper the adjustment meas- 
ures have been discussed in terms of group 
performance. While the “adjusted” and “maladjusted”
groups tended to have similar concepts
of the ideal self and of the average other
person, individual Ss occasionally produced 
sorts for these concepts quite deviant from 
the norm. It should be emphasized that esti- 
mates of adjustment level for single Ss may 
be distorted to the degree that ideal self- and 
average-other-person sorts, as well as self 
sorts, deviate from some adjusted group norm. 


Summary 


The present study represented an attempt 
to measure psychological maladjustment with 
Q-sort data yielding six adjustment measures 
utilizing concepts of self, of ideal self, and of 
the average other person. It was found that 
only measures containing the self sort could 
discriminate a group of “adjusted” from three 
groups of “maladjusted” hospitalized patients. 
“Maladjusted” Ss saw themselves as being 
different from their ideals and from their con- 
cepts of the average other person, while “ad- 
justed” Ss did not. Both “adjusted” and “mal- 
adjusted” Ss tended to hold similar concep- 
tions of the ideal self and of the average other 
person. 
Change and Receptiveness to Psychotherapy¹


Malcolm H. Robertson 


University of Mississippi 


Research has been largely focused on the 
process and evaluation of change during psy- 
chotherapy. If psychotherapy is a situation 
designed to hasten the process of change (1), 
then an equally pertinent problem is how 
one’s concept of change is related to recep- 
tiveness to psychotherapy. 

This study tried to determine whether 
there is a predictive relationship between a 
person’s stated global concept of change and 
receptiveness to psychotherapy. The term re- 
ceptiveness to psychotherapy is used to imply 
a specific attitude of acceptance or non- 
acceptance. 

The sample was divided into five groups. 
The first three groups were undergraduate 
students who had been asked on a question- 
naire whether they would accept psychother- 
apy if it were offered to them and whether 
they had had psychotherapy. The first group 
was comprised of 11 students who had never 
had therapy and would not accept it. The sec- 
ond group of 10 students had had therapy and 
would accept it. A third group represented 47 
students who had never had therapy but 
would accept it. A fourth group included 9 
subjects who had actually requested therapy 
but who had not had their first interview. A 
fifth group was comprised of 23 students who 
were undergoing therapy. The groups were
comparable in terms of age, sex, education,
and socioeconomic status.

1 An extended report of this study may be obtained
without charge from Malcolm H. Robertson,
Department of Psychology, University of Mississippi,
University, Miss., or for a fee from the American
Documentation Institute. Order Document No. 5360,
remitting $1.75 for microfilm or $2.50 for photocopies.

The five groups were administered an eight-item
questionnaire which measured their global
concept of change. The groups were compared
in terms of differences in the frequency
of four response categories. Chi-square values 
were computed to test the significance of the 
differences. 
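
The form of such a comparison can be sketched in Python as a Pearson chi-square on a table of response frequencies. The counts below are hypothetical (the response frequencies are not given here), and the function is a generic illustration rather than the author's exact computation.

```python
def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows.

    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and response category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows are two groups, columns are four response categories.
counts = [
    [10, 20, 30, 40],
    [40, 30, 20, 10],
]
stat, df = chi_square(counts)
print(stat, df)
```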

The prediction that the two acceptance 
groups would have a significantly stronger 
global concept of change than the nonaccept- 
ance group was confirmed. The prediction 
that the two acceptance groups would not dif- 
fer from one another and that the beginning 
therapy and therapy groups would not differ 
was also confirmed. Contrary to the predic- 
tions, the nonacceptance group did not differ 
significantly from the beginning therapy and 
therapy groups, and the beginning therapy 
and therapy groups did have a significantly 
weaker concept of change than the acceptance 
groups. Furthermore, those who had had more 
than 16 therapy sessions had a significantly 
weaker concept of change than did those who 
had had fewer than 16 therapy sessions.

Results were evaluated both in terms of the 
methodological procedures and the hypothe- 
sized relation between concept of change and 
receptiveness to psychotherapy. 
A Stereophonic Sound System for Play
Therapy Observation 


Harold R. Green, John J. Hanson, and Julius Seeman 
George Peabody College for Teachers 


The purpose of this paper is to describe a 
sound system built to monitor play therapy 
interviews. 

Research in play therapy is at best a com- 
plex process; meaning must be derived from 
a subtle interplay of verbal and nonverbal be- 
havior. Often the problem is complicated by 
the sheer difficulty in hearing what is going 
on. The ordinary auditory problems in moni- 
toring are compounded in play therapy by at 
least three factors: (a) Children’s voices are 
sometimes quite low in volume and high in 
pitch. (b) Children often talk and play simultaneously.
The clank of metal or the
splash of water may mask quite effectively 
what the child and therapist are saying. (c) 
The need to have a room which can take 
rough usage usually leads to the use of hard 
materials which increase echo. Fiber board 
and low-hanging drapes, both good for sound 
absorbency, are conspicuously absent from 
most play therapy rooms for good reasons. 

When the play therapy room at Peabody 
was built, the Audio-Visual department in- 
stalled the microphone and speakers which 
composed the single system sound pickup. 
The system had the limitations described 
above. Further consultation with the Audio- 
Visual department brought forth the sugges- 
tion to try the use of a binaural or dual sys- 
tem. This system has advantages which are 
quite critical in their value for monitoring 
play therapy from an adjoining room in con- 
junction with the use of a one-way vision 
mirror. 

The conventional one-channel sound system 
feeds all sound into a single input and amplifier.
Thus all sounds compete directly for attention.
This factor reduces the possibility of
sound selection and sound localization. 

A stereophonic system reduces these diffi- 
culties. The basic plan of this system is to 
provide paired but separate lines from the 
sound to the listener, i.e., paired microphones, 
paired amplifiers, and paired earphones. This 
two-channel system allows the listener to 
bring into greater play one of the major fac- 
tors responsible for sound localization, namely, 
the phase or temporal factor. The two-channel 
input, properly placed, can make approxi- 
mately the same time discriminations as the 
human ears in picking up sound. This factor 
gives the listener a more “natural” situation 
and allows him to exercise the usual selective 
attention to sounds, “tuning out” such sounds 
as the scraping of furniture or the running of 
water when he wishes. It also leads to greater 
fidelity of sound reproduction. 

Licklider describes the effect of a two- 
channel system as follows: 

The idea of the simplest two-channel scheme is to 
record the sounds with two microphones, placed in 
the positions of the two ears of a dummy listener. 
When these sounds are led to a remote listener and 
applied to his ears via earphones, he gets almost ex- 
actly the same auditory picture that he would get 
if he were in the dummy’s position. The effect is very 
compelling. When someone with hobnailed boots 
walks past the dummy, the listener pulls in his feet 
to keep them from getting stepped on (1, p. 1030). 

Our system is similar to the one Licklider 
describes. It has the effect, then, of placing 
the listener aurally within the playroom itself. 

The technical requirements of the system 
are as follows: 


1. Paired microphones, placed horizontally
12-15 inches apart and separated by a piece
of fiberboard to reduce sound overlap.

2. Separate wires running from the microphones
to paired amplifiers of characteristics
similar to each other. Each microphone leads 
to one amplifier. 

3. Separate wires leading from amplifiers to 
earphones, one wire going to each earpiece. 
So long as the wires are kept separate, it is 
possible to use as many earphones as are 
needed for observation. Our system contains 
15 pairs of earphones, each with its own 
volume control so that the listener can adjust
the sound to his own hearing comfort. The in- 
dividual volume controls are useful but not 
necessary to the system. 
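
To give a sense of the size of the temporal factor involved, the Python sketch below estimates the difference in arrival time at two microphones 12 to 15 inches apart. It rests on assumptions not made in this paper: a distant source, a free sound field, a speed of sound of about 343 meters per second, and no effect of the fiberboard divider. The function name is an illustration.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air near 20 degrees C
METERS_PER_INCH = 0.0254


def arrival_delay_ms(spacing_inches, angle_deg):
    """Arrival-time difference (ms) between two microphones for a distant source.

    angle_deg runs from straight ahead (0) to the line joining the
    microphones (90). Simplified free-field model; the divider is ignored.
    """
    spacing_m = spacing_inches * METERS_PER_INCH
    path_difference_m = spacing_m * math.sin(math.radians(angle_deg))
    return 1000.0 * path_difference_m / SPEED_OF_SOUND_M_PER_S


for spacing in (12, 15):
    print(spacing, "in:", round(arrival_delay_ms(spacing, 90), 3), "ms at most")
```

Under these assumptions the largest difference is roughly 0.9 to 1.1 milliseconds, falling to zero for a source directly in front of the pair.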

It is necessary to mark each earpiece as 
“left ear” and “right ear” so that the listener 
does not reverse the earpiece and place on his
left ear the earpiece leading to the right-hand 
microphone. This error cancels out localiza- 
tion, at least temporarily, though it does not 
impair audibility. 


Summary 


This article describes a two-channel sound 
system used to improve sound reception in 
monitoring play therapy interviews. 


Received February 28, 1957. 
A Short Forced-Choice Anxiety Scale¹


Richard Christie and Stanley Budnitzky 
Columbia University 


Bendig (1) has pointed out that but 20 of 
the 50 original items in the Taylor Anxiety 
Scale appear to have clinical validity, and pre- 
sented evidence that the reliability of these 
items is almost as high as of those in the origi- 
nal scale. Heineman (3) constructed forced- 
choice forms of the Taylor scale which largely 
eliminated extraneous variance traceable to re- 
sponse set and social acceptability. 

The advantages of both investigators’ work 
have been combined in a short forced-choice 
scale. Heineman’s item format was utilized 
for the 20 items designated by Bendig. Each 
anxiety item appears in a triplet. We fol- 
lowed Heineman’s procedure of having each 
S indicate which one of the three items was 
most true and which one was least true of 
self. The scoring differed slightly: designa- 
tion of a positively worded anxiety item as 
most true was scored plus one, designation as
least true as minus one. Negatively stated 
items, i.e., those for which acceptance indicated a lack
of anxiety, were scored in opposite fashion. A
constant of 20 was added to raw scores mak- 
ing the possible range from zero to 40. 
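
The scoring rule just described can be written out as a short Python sketch. The data layout and names are hypothetical, and it is assumed that an anxiety item chosen as neither most nor least true contributes nothing to the raw score, which the plus-one and minus-one rule implies but does not state.

```python
def score_scale(responses, constant=20):
    """Score the short forced-choice scale.

    responses: one (keyed_positive, choice) pair per anxiety item, where
    keyed_positive is True for a positively worded anxiety item and choice
    is "most", "least", or None (item chosen as neither; assumed to add 0).
    """
    raw = 0
    for keyed_positive, choice in responses:
        if choice == "most":
            value = 1
        elif choice == "least":
            value = -1
        else:
            value = 0
        # Negatively stated items are scored in the opposite direction.
        raw += value if keyed_positive else -value
    return raw + constant


# Hypothetical record: every item positively worded and marked "most true".
print(score_scale([(True, "most")] * 20))
```

With 20 items the raw score runs from minus 20 to plus 20, so the added constant gives the stated range of zero to 40.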

This scale has been given to four classes of 
medical students.² The split-half reliabilities
range from .65 to .84, the mean reliability of
.75 comparing with that of .76 reported by
Bendig. Heineman found lower reliabilities for
one form of scoring and higher for another on
50 items.

1 An extended report of this study, including the
short scale, may be obtained without charge from
Richard Christie, 605 West 115th Street, New York
25, N. Y., or for a fee from the American Documentation
Institute. Order Document No. 5317, remitting
$1.25 for microfilm or $1.25 for photocopies.

2 This scale was developed in the course of research
on the relationship between personality variables and
performance among medical school students. This is
a part of ongoing studies in the Sociology of Medical
Education by the Bureau of Applied Social Research
of Columbia University under a grant from
the Commonwealth Fund.

The validity of the revised scale rests pri- 
marily upon Bendig’s synthesis of the work of 
Buss (2) and of Hoyt and Magoon (4). One 
inferential indication of validity comes from 
one class where pooled ratings of student per- 
formance in clinics were available.* Twelve of 
the 73 students were rated as outstanding and 
had significantly lower (.01 by Fisher’s exact 
test) scores on the scale than the nine poorest 
performers. Such a finding is consistent with 
the hypothesized relationship between anxiety 
and performance in complex situations. 
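
How Fisher's exact test yields such a probability can be sketched in Python. The example assumes the two groups were compared on a two-way split of scale scores, which is not spelled out here, and the cell counts are hypothetical; only the group sizes of 12 and 9 come from the text.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose upper-left cell is a or smaller.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = 0.0
    for k in range(max(0, col1 - row2), a + 1):
        total += comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)
    return total


# Hypothetical split: rows are outstanding (12) and poorest (9) performers,
# columns are high and low scale scores. NOT the study's data.
p = fisher_exact_one_tailed(2, 10, 8, 1)
print(round(p, 4))
```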

Inasmuch as the present modification com- 
bines the desirable features of both Heine- 
man’s and Bendig’s scales, its use is suggested 
when a short forced-choice anxiety scale is 
desired. 


Brief Report. 
A Note on the Use of Doppelt’s Short Form of the
WAIS with Psychiatric Patients¹


David M. Sterne 
VA Hospital, Vancouver, Washington 


In February, 1956, Doppelt (1) reported 
the development of an abbreviated form of 
the Wechsler Adult Intelligence Scale with an 
effective simplified regression equation for 
predicting the Full Scale score. The data used 
were those gathered in the national standard- 
ization of the WAIS, a sample of normal in- 
dividuals designed to represent the United 
States’ population (2). The form utilizes four 
subtests: Arithmetic, Vocabulary, Block De- 
sign and Picture Arrangement; the first two 
chosen as best predictors of the total Verbal 
score, and the latter two as best predictors of 
the total Performance score. Correlation coefficients
between the abbreviated form in its
regression equation and the Full Scale score
ranged from .95 to .96 and the standard error 
of estimate obtained was about seven stand- 
ard score points. 

The use of the shortened WAIS was con- 
templated with psychiatric and neurological 
patients in a general medical and surgical 
hospital. Since it was conceded that inter- 
subtest variation with such a sample might 
affect the prediction efficiency of the Doppelt 
equation, the formula was applied to test data
obtained from 35 male veterans with varied 
psychiatric diagnoses and 12 male veterans 
with neurological illnesses involving brain 
damage, all patients in a VA GM&S hospital. 
Thirteen of the psychiatric cases carried
schizophrenic diagnoses; the remainder were
mainly psychoneurotic reactions or personality
disorders. The neurological cases included
epileptics, traumatic brain damage cases,
cerebral arteriosclerotics, multiple sclerotics,
and similar conditions with organic brain
involvement.

1 From VA Hospital, Vancouver, Washington.

Correlation coefficients between the sum of 
scaled scores on the four tests and the Full 
Scale scores were .97 for the psychiatric sam- 
ple, .88 for the neurological sample, and .96 
for the two combined. Standard errors of es- 
timate were 6.5, 8.0, and 7.1, respectively. 
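
The relation between a correlation and its standard error of estimate can be illustrated with the textbook formula, in which the standard error equals the standard deviation of the predicted scores times the square root of one minus r squared. The Python sketch below applies that formula; whether the reported values were computed in exactly this way, with or without a correction for degrees of freedom, is not stated, so the implied standard deviations are rough back-calculations only.

```python
import math


def standard_error_of_estimate(sd_y, r):
    """Textbook large-sample form: SD of the criterion times sqrt(1 - r squared)."""
    return sd_y * math.sqrt(1.0 - r * r)


def implied_sd(see, r):
    """Back out the criterion SD implied by a reported r and standard error."""
    return see / math.sqrt(1.0 - r * r)


# Correlations and standard errors as reported above.
reported = {
    "psychiatric": (0.97, 6.5),
    "neurological": (0.88, 8.0),
    "combined": (0.96, 7.1),
}
for sample, (r, see) in reported.items():
    print(sample, round(implied_sd(see, r), 1))
```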

Doppelt’s short form of the WAIS appears 
to be applicable and useful with male psychi- 
atric patients in a general medical and surgi- 
cal hospital. Although the number of neuro- 
logical cases is small, the results are suggestive 
that prediction with such cases would be some- 
what less accurate than with individuals whose 
performance is unhampered by organic brain 
involvement. 


Received March 18, 1957. 
The Kahn Test of Symbol Arrangement as an
Aid to Psychodiagnosis


Paul D. Murphy, M. Richard Ferriman, and Russell W. Bolinger¹
USAF Hospital, Wright-Patterson Air Force Base, Ohio 


Five years have passed since Shoben re- 
viewed the Kahn Test of Symbol Arrange- 
ment and, on the strength of two publications 
(3, 8), wrote: “. . . the KTSA (Kahn Test 
of Symbol Arrangement) is a simpler, more 
widely applicable situation than most instru- 
ments on hand for investigation of develop- 
mental patterns and various attributes of 
psychopathological behavior. On a research 
basis it should be strongly encouraged. As a 
test it is still essentially unproven” (15, p. 
111). 

Shoben’s appeal for research with this in- 
strument may have played a part in stimu- 
lating subsequent studies which ranged from 
an investigation of epileptic children (1) to a 
study of parolees from a maximum security 
prison (5). The new clinical manual for the 
Kahn test describes fourteen recent studies 
which lend support to the early prediction of 
the instrument’s capacity to investigate psy- 
chopathology (13). Among those which in- 
terested the present authors are two studies 
in which blind psychodiagnosis was attempted 
by means of symbol patterns the test yields 
when the new method of scoring the test is 
employed (11, 12). By means of these pat- 
terns Kahn, Harter, Rider, and Lum claim 
they were able to differentially identify non- 
psychotics, schizophrenics, and patients with 
organic brain disease. In that study, 71.8% 
of an unknown group of 170 subjects were 
correctly classified into the above mentioned 
categories by means of blind sorting accomplished
by individuals who had had no training
in psychology or psychiatry (13, pp. 112-
117). Another study reported in the same
publication describes the successful identifica-
tion of neurotics, normals, borderline schizo-
phrenics, psychopaths, and psychotics by
means of the Kahn test symbol patterns alone
(13, pp. 117-119 and pp. 153-160).


1 Dr. Murphy was the chief of the neuropsychiatric
service and the mental hygiene clinic at the time this
study was conducted. The other authors were mem-
bers of the neuropsychiatric team.


Method 


This study is a modest attempt to replicate
some of the earlier validation studies in
which blind analysis by means of symbol
patterns was carried out.

The test consists of fifteen plastic objects 
which the subject must arrange five times on 
a felt strip having consecutively numbered 
segments ranging from 1 to 15. The adminis- 
tration and test materials are described in an 
earlier publication (8). 

The present study used a sample consisting 
of an unselected group of 48 patients who 
were classified into one of four categories by 
forced choice, using the patient’s Kahn sym- 
bol pattern and no other data. The group’s 
mean age was 24 with an SD of 8.7. The mean
educational level was 11.3 with an SD of 4.2. All
of the patients were members of the military 
service; the majority were enlisted men. 


The symbol pattern consists of a number score and 
a series of letters. The number score represents 
weights derived by Kahn from t ratio comparisons of
clinical with normal groups (14). The letters repre- 
sent different levels of abstraction—or “symboliza- 
tion” as Kahn calls it (12). Frequency of occurrence 
determines the serial position of the letter in the 
pattern. The pattern can easily be derived from the 
psychograph which appears on the Individual Record 
Sheet furnished by the test publisher (12). 

The procedure of the study was as follows: Upon 
referral, patients were tested with the Kahn Test of 
Symbol Arrangement before any diagnosis was established.
The patients’ symbol patterns were written
on cards which were subsequently thoroughly mixed. 
These cards contained no other data except a number 
on the reverse side to identify the patient to whom 
the symbol pattern belonged. The sorter was not 
permitted to reverse the cards during the sorting; 
however, even if he should, by chance, have seen the 
number it would have had no meaning to him since 
the sorting was accomplished by a psychologist who 
had never seen any of the patients included in this 
investigation. The sorter followed the instructions 
presented by Kahn for sorting cards into nosological 
groups (13, pp. 153-155). The sorting results were 
matched against the final diagnoses arrived at by 
members of the psychiatric team excluding the 
member who had accomplished the sorting. In cases 
where there was doubt about the correctness of the 
final diagnosis, the patient was presented to qualified 
civilian consultants² and their diagnostic impression
was used as the final one. 


Results 


The categories into which the patients’ 
symbol patterns were sorted were: neurosis 
(N), character and behavior disorder (C), 
organic brain disease (O), and schizophrenia 
(S). The results are given in Table 1. 

Table 1 shows that 38 out of 48 patients, 
or 79.2%, were classified correctly as com- 
pared with an expected 25% (since there were 
four categories) if the Kahn symbol pattern 
were of no value in classifying patients. If 
one assumes that the 48 patients constitute a 
random sample from a Gaussian population, 
one might say with 95% confidence that the 
proportion of the correct classifications in the 
population exceeds 69.5% which is nearly 
three times the chance expectancy. One would 
be disposed to describe the fact that we cor- 
rectly identified all of the four organics as a 
“lucky” coincidence. Actually, however, there
is just about one chance in 80 million that 
chance alone was operating in the correct 
identification of all of the four patients having 
organic brain disease. These findings support 
the earlier success various authors have had 
in identifying schizophrenics by means of the 
Kahn test (2, 3, 4, 16). Our ability to iden- 
tify patients with organic brain disease sup- 
ports previous work (6, 8, 9). Our accuracy 
in identifying neurotics roughly parallels the 
more extensive investigations which preceded 
our study (7; 13, pp. 109-110, pp. 117-119). 
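
The two figures quoted in this paragraph can be checked with a short calculation. The sketch below is not taken from the report. It assumes that the 69.5% figure is a one-sided 95% lower bound from the normal approximation to the binomial, and that the "one chance in 80 million" treats each of the 48 sortings as an independent guess with probability 1/4 per category. The function names are illustrative only.

```python
import math

N_PATIENTS = 48    # cards sorted
N_CORRECT = 38     # correct classifications reported in the text
N_CATEGORIES = 4   # N, C, O, S
N_ORGANIC = 4      # patients with organic brain disease

def lower_bound_95(correct, total, z=1.645):
    """One-sided 95% lower bound on a proportion (normal approximation)."""
    p_hat = correct / total
    se = math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat - z * se

def chance_all_organics(total, organics, categories):
    """Probability that exactly the organic patients land in the organic
    pile, assuming every card is an independent uniform guess."""
    p = 1 / categories
    return p ** organics * (1 - p) ** (total - organics)

bound = lower_bound_95(N_CORRECT, N_PATIENTS)
odds = 1 / chance_all_organics(N_PATIENTS, N_ORGANIC, N_CATEGORIES)

print(f"observed accuracy: {N_CORRECT / N_PATIENTS:.1%}")
print(f"95% lower bound:   {bound:.1%}")
print(f"about one chance in {odds / 1e6:.0f} million")
```

Under these assumptions the script reproduces both the 69.5% bound and the figure of roughly 80 million. The second result depends heavily on the chance model. If one assumed instead that the sorter simply placed four cards in the organic pile, the chance of picking the four organic patients would be 1 in 194,580 (the number of ways of choosing 4 cards from 48).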


2 Drs. Harry L. MacKinnon and Arnold Allen, assistant
professors in psychiatry, School of Medicine,
University of Cincinnati.




Table 1

Blind Sorting of Mental Patients by Kahn Symbol Patterns

                Sorting by Symbol Patterns
Final
Diagnoses      N      C      O      S    Total

N             11      3      0      3      17
C              0     15      0      2      17
O              0      0      4      0       4
S              0      2      0      8      10

Total         11     20      4     13      48
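
Table 1 can also be read row by row. The following sketch is an illustration and not part of the original analysis. It stores the counts with the sorting columns in the order N, C, O, S (the order in which the categories are listed in the text) and computes the proportion of each diagnostic group that was sorted correctly, together with the overall hit rate. The variable and function names are assumptions.

```python
# Rows: final diagnosis. Columns: pile chosen by blind sorting, in the
# order N, C, O, S. Counts are copied from Table 1.
CATEGORIES = ["N", "C", "O", "S"]
TABLE_1 = {
    "N": [11, 3, 0, 3],
    "C": [0, 15, 0, 2],
    "O": [0, 0, 4, 0],
    "S": [0, 2, 0, 8],
}

def hit_rates(table, categories):
    """Share of each diagnostic group placed in its own category."""
    rates = {}
    for i, cat in enumerate(categories):
        row = table[cat]
        rates[cat] = row[i] / sum(row)
    return rates

def overall_hit_rate(table, categories):
    """Diagonal total divided by the number of patients."""
    correct = sum(table[cat][i] for i, cat in enumerate(categories))
    total = sum(sum(row) for row in table.values())
    return correct / total

rates = hit_rates(TABLE_1, CATEGORIES)
overall = overall_hit_rate(TABLE_1, CATEGORIES)

for cat in CATEGORIES:
    print(f"{cat}: {rates[cat]:.1%}")
print(f"overall: {overall:.1%}")
```

The diagonal of the table sums to 38, which matches the 38 correct classifications out of 48 reported in the text.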

Summary 


The symbol patterns yielded by the Kahn 
Test of Symbol Arrangement of 48 patients 
were blindly sorted by a member of the 
neuropsychiatric team of an Air Force hos- 
pital who had had no contact with the pa- 
tients. The sorter was able to correctly iden- 
tify a highly significant number of neurotics, 
persons with character and behavior disorders, 
organics, and schizophrenics. The results sup- 
port the work of earlier investigators who have 
used this technique and bear out the predic- 
tion made by an early reviewer of this test. 


Received July 24, 1957.


New Tests 


Bennett, George K., Bennett, Marjorie G., Wallace, 
Wimburn L., & Wesman, Alexander G. College 
Qualification Tests (CQT). College entrants and 
students. 2 forms. 80 (105) min. IBM or hand 
scoring. Combined edition: test booklet ($5.00 per 
25), with manual, pp. 34, and keys; answer sheet 
($3.50 per 50); specimen set (60¢). Separate edi- 
tion, Verbal, Numerical, or Information: test book- 
let ($2.50 per 25 ea.), with manual, and keys; an- 
swer sheet ($1.90 per 50 ea.); specimen set (90¢). 
Form B restricted. New York: Psychological Corp., 
1957. 


The College Qualification Tests are a new resource 
for use by colleges in admission, placement, and 
guidance. Counselors may also use Form A of the 
tests in advisement about college entrance. Tests are 
sold to colleges who wish to schedule and carry out 
their own testing programs without the aid of an 
outside agency; the colleges are protected by the re- 
stricted distribution of Form B. The CQT consists 
of three subtests—75 vocabulary items yielding the 
V score, 50 items from arithmetic, algebra, and geom- 
etry giving the N score, and 75 information items 
(I) drawn about equally from the social and physical
sciences and yielding two subscores. 

The Manual shows that the test conforms to high 
technical standards. Corrected split-half reliability is 
about .97 for the total score, and predictive validity 
for first semester grade point averages centers around 
.55, and varies from .34 to .71 in the 24 college sam-
ples reported. The intercorrelations of the subtests
range from .45 to .72, showing both homogeneity and 
a reasonable degree of independence. As yet, there 
are no data on the differential prediction of grades 
in various subjects from part scores. General percen- 
tile norms for admitted freshmen are derived from 
13,516 men and 7,996 women in 37 colleges located in 
22 states. Tables of norms are also given for 10 sub- 
groups of students, classified by type of college and 
curriculum. Correlations and tables of equivalent 
scores relate the CQT to the ACE and the SCAT. 


All evidence shows that the CQT represents a thor- 
oughly competent job of test construction, worthy of 
wide use. New forms are contemplated biennially, and
technical bulletins will be issued as further evidence 
on the tests is obtained —L. F. S. 


Kahn, Theodore C. Kahn Test of Symbol Arrange- 
ment (KTSA). Age 6-adult. Individual test. 1 
form. (15-20) min. Record sheet ($7.50 per 50); 
general manual, pp. 36 ($2.00); clinical manual, 
pp. 74 ($3.00); complete set, including plastic ob- 
jects, felt strip, two manuals, and 10 record sheets 
($25.00). Grand Forks, N. D.: Psychological Test 
Specialists, 1956, 1957. 


The KTSA is an intriguingly novel method for 
clinical appraisal which combines some features of an 
objective test with some of a projective technique. 
The principal material consists of miniature plastic 
objects that suggest conventional meanings in our
culture—hearts, stars, dogs, butterflies, an anchor, a 
cross, and a vaguely phallic shape. Under varying in- 
structions, the examinee arranges the objects five 
times on a felt strip having squares numbered from 
1 to 15, gives reasons for his arrangements, tells what 
the objects symbolize, and indicates the ones he likes 
and dislikes. The objective results are expressed as a 
single numerical score and a letter code which gives 
the rank order of nine scoring categories. The Gen- 
eral Manual gives clear and orderly instructions for 
administering and scoring the test. The Clinical Man-
ual discusses the rationale of the test, reports 15 re-
search studies, and describes the interpretation of the
results for the differential diagnosis of normality, 
neurosis, character disorder, schizophrenia, and psy- 
chosis with brain damage. In general, the greatest 
weight for normality comes from a good number of 
abstract symbolizations and the absence of bizarre, 
inferior, or blocked responses; psychopathology is 
indicated by the opposite trends. 

The 15 research studies, several of them not previ- 
ously published, give considerable information about 
the validity of the KTSA but have a number of 
shortcomings. Two studies show clear success in dif-
ferentiating brain damaged cases from normals. In
one study, schizophrenics were not satisfactorily sep- 
arated from either normals or the brain damaged; in 
another all psychotics including organics were thrown 
together in unspecified proportions. Studies seem to 
give conflicting evidence as to whether neurotics are 
differentiated to a clinically useful degree. In no 
study was attention given to the troublesome prob- 
lem of the base rate. If the base rate for psychosis is 
1%, the test probably yields more false positives than 
true positives. 
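
The base-rate point can be made concrete with a small calculation. The test characteristics in the sketch below are purely hypothetical: a hit rate of 80% among psychotics and a false-alarm rate of 10% among non-psychotics are assumed for illustration and are not taken from the manual or from any study. Only the 1% base rate comes from the text.

```python
def expected_positives(base_rate, sensitivity, false_positive_rate, n=10_000):
    """Expected numbers of true and false positives among n examinees."""
    cases = n * base_rate
    non_cases = n - cases
    true_pos = cases * sensitivity
    false_pos = non_cases * false_positive_rate
    return true_pos, false_pos

# Hypothetical values, chosen only to illustrate the arithmetic.
ASSUMED_SENSITIVITY = 0.80
ASSUMED_FALSE_POSITIVE_RATE = 0.10

tp, fp = expected_positives(0.01, ASSUMED_SENSITIVITY, ASSUMED_FALSE_POSITIVE_RATE)
ppv = tp / (tp + fp)

print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"share of positives that are true cases: {ppv:.1%}")
```

With these assumed figures, false positives outnumber true positives by roughly twelve to one, so even a test that separates groups well in a balanced research sample could mislabel most of the people it flags when the condition is rare.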

The projective interpretations of the instrument 
are stated with becoming modesty, e.g., “slanting
hearts may indicate hostility to the opposite sex.” As 
with many other projective methods, such interpre- 
tations spring from clinical sense and may well lead 
to useful idiographic hypotheses. They are supported 
by no evidence. 

Although the KTSA has been under development 
for about ten years, it shows many signs of still being 
in a process of evolution. For example, the rules for 
interpretation given in the manual are not those used 
in any reported research study. The symbol pattern 
interpretations stated on a reference card supplied 
with the set even differ a little from those in the 
manual. The test is clearly an interesting device for
further research, but it is not yet ready for unquali-
fied use.—L. F. S.


University of Pennsylvania, School of Education, 
Group B of the Suburban School Study Council. 
Pupil Adjustment Inventory. Rating scale for 
teachers’ use in rating pupils. 2 forms. Test pack- 
age; 35 short forms and 5 long forms, with man- 
ual, pp. 17 ($3.60); specimen set (80¢). Boston: 
Houghton Mifflin, 1957. 


The Pupil Adjustment Inventory provides a sys- 
tematic means by which teachers may rate the edu- 
cational and personal adjustment of elementary school 
and high school pupils. The short form, of 15 items, 
is for surveys; the long form, of 55 items, is intended
for more detailed individual study. Each item is a 
five-step rating scale with each step defined by a 
brief verbal description. The content and form of the 
items seem sound. The Rater’s Manual gives adequate 
instructions for the use of the scales and anecdotes 
supporting their value; it is deficient in that it pro- 
vides no statistical data. The best use of the scales 
will be to increase the involvement of teachers in the 
thoughtful consideration of their pupils as persons. 
—L. F. S.